(57) Abstract

A method of locating an inhibitory/instability sequence or sequences within the coding region of an mRNA and modifying
the gene encoding that mRNA to remove these inhibitory/instability sequences by making clustered nucleotide substitutions
without altering the coding capacity of the gene is disclosed. Constructs containing these mutated genes and host cells containing
these constructs are also disclosed. The method and constructs are exemplified by the mutation of a Human Immunodeficiency
Virus-1 Rev-dependent gag gene to a Rev-independent gag gene. Constructs useful in locating inhibitory/instability sequences
within either the coding region or the 3' untranslated region of an mRNA are also disclosed. The exemplified constructs of the
invention may also be useful in HIV-1 immunotherapy and immunoprophylaxis.

METHOD OF ELIMINATING
INHIBITORY/INSTABILITY REGIONS OF mRNA

This application is a continuation-in-part of 
U.S. Serial No. 07/858,747, filed March 27, 1992.

I. TECHNICAL FIELD 

The invention relates to methods of increasing 
the stability and/or utilization of a mRNA produced by a 
gene by mutating regulatory or inhibitory/instability 
sequences (INS) in the coding region of the gene which 
prevent or reduce expression. The invention also relates 
to constructs, including expression vectors, containing 
genes mutated in accordance with these methods and host
cells containing these constructs. 

The methods of the invention are particularly 
useful for increasing the stability and/or utilization of 
a mRNA without changing its protein coding capacity.
These methods are useful for allowing or increasing the
expression of genes which would otherwise not be expressed
or which would be poorly expressed because of the presence
of INS regions in the mRNA transcript. Thus, the methods,
constructs and host cells of the invention are useful for
increasing the amount of protein produced by any gene
which encodes an mRNA transcript which contains an INS.

The methods, constructs and host cells of the 
invention are useful for increasing the amount of protein 
produced from genes such as those coding for growth
factors, interferons, interleukins, the fos proto-oncogene
protein, and HIV-1 gag and env, for example.

The invention also relates to using the 
constructs of the invention in immunotherapy and 
immunoprophylaxis, e.g., as a vaccine, or in genetic
therapy after expression in humans. Such constructs can
include or be incorporated into retroviral or other
expression vectors or they may also be directly injected
into tissue cells resulting in efficient expression of the
encoded protein or protein fragment. These constructs may
also be used for in vivo or in vitro gene replacement,
e.g., by homologous recombination with a target gene in
situ.

The invention also relates to certain 
exemplified constructs which can be used to simply and 
rapidly detect and/or define the boundaries of 
inhibitory/instability sequences in any mRNA, methods of
using these constructs, and host cells containing these
constructs. Once the INS regions of the mRNAs have been
located and/or further defined, the nucleotide sequences
encoding these INS regions can be mutated in accordance
with the method of this invention to allow the increase in
stability and/or utilization of the mRNA and, therefore,
allow an increase in the amount of protein produced from
expression vectors encoding the mutated mRNA.






II. BACKGROUND ART 

While much work has been devoted to studying 
transcriptional regulatory mechanisms, it has become 
increasingly clear that post-transcriptional processes
also modulate the amount and utilization of RNA produced
from a given gene. These post-transcriptional processes
include nuclear post-transcriptional processes (e.g.,
splicing, polyadenylation, and transport) as well as
cytoplasmic RNA degradation. All these processes
contribute to the final steady-state level of a particular
transcript. These points of regulation create a more
flexible regulatory system than any one process could
produce alone. For example, a short-lived message is less
abundant than a stable one, even if it is highly
transcribed and efficiently processed. The efficient rate
of synthesis ensures that the message reaches the
cytoplasm and is translated, but the rapid rate of
degradation guarantees that the mRNA does not accumulate
to too high a level. Many RNAs, for example the mRNAs for
proto-oncogenes c-myc and c-fos, have been studied which
exhibit this kind of regulation in that they are expressed
at very low levels, decay rapidly and are modulated
quickly and transiently under different conditions. See
M. Hentze, Biochim. Biophys. Acta 1090:281-292 (1991) for
a review. The rate of degradation of many of these mRNAs
has been shown to be a function of the presence of one or
more instability/inhibitory sequences within the mRNA
itself.

Some cellular genes which encode unstable or
short-lived mRNAs have been shown to contain A and U-rich
(AU-rich) INS within the 3' untranslated region (3' UTR)
of the transcript mRNA. These cellular genes include the
genes encoding granulocyte-monocyte colony stimulating
factor (GM-CSF), whose AU-rich 3' UTR sequences (containing
8 copies of the sequence motif AUUUA) are more highly
conserved between mice and humans than the protein
encoding sequences themselves (93% versus 65%) (G. Shaw
and R. Kamen, Cell 46:659-667 (1986)) and the myc
proto-oncogene (c-myc), whose untranslated regions are conserved
throughout evolution (for example, 81% for man and mouse)
(M. Cole and S.E. Mango, Enzyme 44:167-180 (1990)). Other
unstable or short-lived mRNAs which have been shown to
contain AU-rich sequences within the 3' UTR include
interferons (alpha, beta and gamma IFNs); interleukins
(IL1, IL2 and IL3); tumor necrosis factor (TNF);
lymphotoxin (Lym); IgG1 induction factor (IgG IF);
granulocyte colony stimulating factor (G-CSF); myb
proto-oncogene (c-myb); and sis proto-oncogene (c-sis) (G. Shaw
and R. Kamen, Cell 46:659-667 (1986)). See also, R.
Wisdom and W. Lee, Gen. & Devel. 5:232-243 (1991) (c-myc);
A. Shyu et al., Gen. & Devel. 5:221-231 (1991) (c-fos); T.
Wilson and R. Treisman, Nature 336:396-399 (1988) (c-fos);
T. Jones and M. Cole, Mol. Cell. Biol. 7:4513-4521 (1987)
(c-myc); V. Kruys et al., Proc. Natl. Acad. Sci. USA
89:673-677 (1992) (TNF); D. Koeller et al., Proc. Natl.
Acad. Sci. USA 88:7778-7782 (1991) (transferrin receptor
(TfR) and c-fos); I. Laird-Offringa et al., Nucleic Acids
Res. 19:2387-2394 (1991) (c-myc); D. Wreschner and G.
Rechavi, Eur. J. Biochem. 172:333-340 (1988) (which
contains a survey of genes and relative stabilities);
Bunnell et al., Somatic Cell and Mol. Genet. 16:151-162
(1990) (galactosyltransferase-associated protein (GTA),
which contains an AU-rich 3' UTR with regions that are 98%
similar among humans, mice and rats); and Caput et al.,
Proc. Natl. Acad. Sci. 83:1670-1674 (1986) (TNF, which
contains a 33 nt AU-rich sequence conserved in toto in the
murine and human TNF mRNAs).

Some of these cellular genes which have been
shown to contain INS within the 3' UTR of their mRNA have
also been shown to contain INS within the coding region.
See, e.g., R. Wisdom and W. Lee, Gen. & Devel. 5:232-243
(1991) (c-myc); A. Shyu et al., Gen. & Devel. 5:221-231
(1991) (c-fos).

Like the cellular mRNAs, a number of HIV-1 mRNAs
have also been shown to contain INS within the protein
coding regions, which in some cases coincide with areas of
high AU content. For example, a 218 nucleotide region
with high AU content (61.5%) present in the HIV-1 gag
coding sequence and located at the 5' end of the gag gene
has been implicated in the inhibition of gag expression.
S. Schwartz et al., J. Virol. 66:150-159 (1992). Further
experiments have indicated the presence of more than one
INS in the gag-protease gene region of the viral genome
(see below). Regions of high AU content have been found
in the HIV-1 gag/pol and env INS regions. The AUUUA
sequence is not present in the gag coding sequence, but it
is present in many copies within gag/pol and env coding
regions. S. Schwartz et al., J. Virol. 66:150-159 (1992).
See also, e.g., M. Emerman, Cell 57:1155-1165 (1989) (env
gene contains both 3' UTR and internal
inhibitory/instability sequences); C. Rosen, Proc. Natl.
Acad. Sci. USA 85:2071-2075 (1988) (env); M.
Hadzopoulou-Cladaras et al., J. Virol. 63:1265-1274 (1989)
(env); F. Maldarelli et al., J. Virol. 65:5732-5743 (1991)
(gag/pol); A. Cochrane et al., J. Virol. 65:5303-5313
(1991) (pol). F. Maldarelli et al., supra, note that the
direct analysis of the function of INS regions in the
context of a replication-competent, full-length HIV-1
provirus is complicated by the fact that the intragenic
INS are located in the coding sequences of virion
structural proteins. They further note that changes in
these intragenic INS sequences would in most cases affect
protein sequences as well, which in turn could affect the
replication of such mutants.
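
The two sequence features discussed above, overall AU content and the number of AUUUA pentanucleotides, are straightforward to compute for any candidate region. The following is a minimal sketch, not part of the disclosure: the function names, the window size and the threshold are illustrative assumptions, and a high score only flags a region as a candidate, since INS regions need not be AU-rich.

```python
def _as_rna(seq: str) -> str:
    """Upper-case the sequence and write T as U so DNA or RNA input works."""
    return seq.upper().replace("T", "U")


def au_content(seq: str) -> float:
    """Fraction of bases in the sequence that are A or U (T counts as U)."""
    rna = _as_rna(seq)
    if not rna:
        return 0.0
    return sum(1 for base in rna if base in "AU") / len(rna)


def count_motif(seq: str, motif: str = "AUUUA") -> int:
    """Count occurrences of a motif, allowing overlapping matches."""
    rna, target = _as_rna(seq), _as_rna(motif)
    count = 0
    start = rna.find(target)
    while start != -1:
        count += 1
        start = rna.find(target, start + 1)
    return count


def au_rich_windows(seq: str, window: int = 50, threshold: float = 0.6):
    """Return (start, AU fraction) for every window at or above the threshold.

    The window size and threshold are hypothetical defaults chosen for
    illustration; they are not values taken from the text.
    """
    rna = _as_rna(seq)
    hits = []
    for i in range(0, len(rna) - window + 1):
        fraction = au_content(rna[i:i + window])
        if fraction >= threshold:
            hits.append((i, fraction))
    return hits
```

Applied to a coding sequence, `au_content` gives a figure comparable to the 61.5% quoted for the 218 nucleotide gag region, and `count_motif` reports how many AUUUA copies are present.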

The INS regions are not necessarily AU-rich.
For example, the c-fos coding region INS is structurally
unrelated to the AU-rich 3' UTR INS (A. Shyu et al., Gen.
& Devel. 5:221-231 (1991)), and some parts of the env
coding region, which appear to contain INS elements, are
not AU-rich. Furthermore, some stable transcripts also
carry the AUUUA motif in their 3' UTRs, implying either
that this sequence alone is not sufficient to destabilize
a transcript, or that these messages also contain a
dominant stabilizing element (M. Cole and S.E. Mango,
Enzyme 44:167-180 (1990)). Interestingly, elements unique
to specific mRNAs have also been found which can stabilize
a mRNA transcript. One example is the Rev responsive
element, which in the presence of Rev protein promotes the
transport, stability and utilization of a mRNA transcript
(B. Felber et al., Proc. Natl. Acad. Sci. USA 86:1495-1499
(1989)).

It is not yet known whether the AU sequences
themselves, and specifically the Shaw-Kamen sequence,
AUUUA, act as part or all of the degradation signal. Nor
is it clear whether this is the only mechanism employed
for short-lived messages, or if there are different
classes of RNAs, each with its own degradative system.
See M. Cole and S.E. Mango, Enzyme 44:167-180 (1990) for
a review; see also, T. Jones and M. Cole, Mol. Cell.
Biol. 7:4513-4521 (1987). Mutation of the only copy of
the AUUUA sequence in the c-myc RNA INS region has no
effect on RNA turnover; therefore the inhibitory sequence
may be quite different from that of GM-CSF (M. Cole and
S.E. Mango, Enzyme 44:167-180 (1990)), or else the mRNA
instability may be due to the presence of additional INS
regions within the mRNA.

Previous workers have made mutations in genes
encoding AU-rich inhibitory/instability sequences within
the 3' UTR of their transcript mRNAs. For example, G.
Shaw and R. Kamen, Cell 46:659-667 (1986), introduced a 51
nucleotide AT-rich sequence from GM-CSF into the 3' UTR of
the rabbit β-globin gene. This insertion caused the
otherwise stable β-globin mRNA to become highly unstable
in vivo, resulting in a dramatic decrease in expression of
β-globin as compared to the wild-type control. The
introduction of another sequence of the same length, but
with 14 G's and C's interspersed among the sequence, into
the same site of the 3' UTR of the rabbit β-globin gene
resulted in accumulation levels which were similar to that
of wild-type β-globin mRNA. This control sequence did not
contain the motif AUUUA, which occurs seven times in the
AU-rich sequence. The results suggested that the presence
of the AU-rich sequence in the β-globin mRNA specifically
confers instability.

A. Shyu et al., Gen. & Devel. 5:221-231 (1991),
studied the AU-rich INS in the 3' UTR of c-fos by
disrupting all three AUUUA pentanucleotides by single
U-to-A point mutations to preserve the AU-richness of the
element while altering its sequence. This change in the
sequence of the 3' UTR INS dramatically inhibited the
ability of the mutated 3' UTR to destabilize the β-globin
message when inserted into the 3' UTR of a β-globin mRNA
as compared to the wild-type INS. The c-fos
protein-coding region INS (which is structurally unrelated to the
3' UTR INS) was studied by inserting it in-frame into the
coding region of a β-globin and observing the effect of
deletions on the stability of the heterologous
c-fos-β-globin mRNA.

Previous workers have also made mutations in
genes encoding inhibitory/instability sequences within the
coding region of their transcript mRNAs. For example, P.
Carter-Muenchau and R. Wolf, Proc. Natl. Acad. Sci. USA
86:1138-1142 (1989), demonstrated the presence of a
negative control region that lies deep in the coding
sequence of the E. coli 6-phosphogluconate dehydrogenase
(gnd) gene. The boundaries of the element were defined by
cloning a synthetic "internal complementary
sequence" (ICS) and observing the effect of this internal
complementary element on gene expression when placed at
several sites within the gnd gene. The effect of single
and double mutations introduced into the synthetic ICS
element by site-directed mutagenesis on regulation of
expression of a gnd-lacZ fusion gene correlated with the
ability of the respective mRNAs to fold into secondary
structures that sequester the ribosome binding site.
Thus, the gnd gene's internal regulatory element appears
to function as a cis-acting antisense RNA.

M. Lundrigan et al., Proc. Natl. Acad. Sci. USA
88:1479-1483 (1991), conducted an experiment to identify
sequences linked to btuB that are important for its proper
expression and transcriptional regulation, in which a DNA
fragment carrying the region from -60 to +253 (the coding
region starts at +241) was mutagenized and then fused in
frame to lacZ. Expression of β-galactosidase from variant
plasmids containing a single base change was then
analyzed. The mutations were all GC to AT transitions,
as expected from the mutagenesis procedures used. Among
other mutations, a single base substitution at +253
resulted in greatly increased expression of the btuB-lacZ
gene fusion under both repressing and nonrepressing
conditions.

R. Wisdom and W. Lee, Gen. & Devel. 5:232-243
(1991), conducted an experiment which showed that mRNA
derived from a hybrid full length c-myc gene, which
contains a mutation in the translation initiation codon
from ATG to ATC, is relatively stable, implying that the
c-myc coding region inhibitory sequence functions in a
translation dependent manner.

R. Parker and A. Jacobson, Proc. Natl. Acad.
Sci. USA 87:2780-2784 (1990), demonstrated that a region of
42 nucleotides found in the coding region of Saccharomyces
cerevisiae MATα1 mRNA, which normally confers low
stability, can be experimentally inactivated by
introduction of a translation stop codon immediately
upstream of this 42 nucleotide segment. The experiments
suggest that the decay of MATα1 mRNA is promoted by the
translocation of ribosomes through a specific region of
the coding sequence. This 42 nucleotide segment has a
high content (8 out of 14) of rare codons (where a rare
codon is defined by its occurrence fewer than 13 times per
1000 yeast codons (citing S. Aota et al., Nucl. Acids
Res. 16:r315-r402 (1988))) that may induce slowing of
translation elongation. The authors of the study, R.
Parker and A. Jacobson, state that the concentration of
rare codons in the sequences required for rapid decay,
coupled with the prevalence of rare codons in unstable
yeast mRNAs and the known ability of rare codons to induce
translational pausing, suggests a model in which mRNA
structural changes may be affected by the particular
positioning of a paused ribosome. Another author stated
that it would be revealing to find out whether (and how) a
kinetic change in translation elongation could affect mRNA
stability (M. Hentze, Biochim. Biophys. Acta 1090:281-292
(1991)). R. Parker and A. Jacobson note, however, that
the stable PGK1 mRNA can be altered to include up to 40%
rare codons with, at most, a 3-fold effect on steady-state
mRNA level and that this difference may actually be due to
a change in transcription rates. Thus, these authors
conclude, it seems unlikely that ribosome pausing per se
is sufficient to promote rapid mRNA decay.
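
The rare-codon criterion quoted above (a codon occurring fewer than 13 times per 1000 codons) can be applied mechanically once a codon usage table is available. The sketch below is illustrative only: the function name is an assumption, and the usage table must be supplied by the caller from a real compilation, since no usage values are given in the text.

```python
def rare_codon_fraction(seq, usage_per_thousand, threshold=13.0):
    """Fraction of in-frame codons whose usage falls below the threshold.

    seq                 coding sequence, read in frame from its first base
    usage_per_thousand  dict mapping each codon (RNA alphabet) to its
                        frequency per 1000 codons; supplied by the caller
    threshold           codons used fewer than this many times per 1000
                        are counted as rare (13 is the cutoff quoted above)
    """
    rna = seq.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        return 0.0
    # A codon missing from the table raises KeyError rather than being
    # silently treated as rare or common.
    rare = sum(1 for codon in codons if usage_per_thousand[codon] < threshold)
    return rare / len(codons)
```

For the 42 nucleotide segment described above, 8 rare codons out of 14 corresponds to a fraction of about 0.57.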

None of the aforementioned references describe
or suggest the present invention of locating
inhibitory/instability sequences within the coding region
of an mRNA and modifying the gene encoding that mRNA to
remove these inhibitory/instability sequences by making
multiple nucleotide substitutions without altering the
coding capacity of the gene.

III. DISCLOSURE OF THE INVENTION

The invention relates to methods of increasing 
the stability and/or utilization of a mRNA produced by a 
gene by mutating regulatory or inhibitory/instability 
sequences (INS) in the coding region of the gene which
prevent or reduce expression. The invention also relates
to constructs, including expression vectors, containing 
genes mutated in accordance with these methods and host 
cells containing these constructs. 

As defined herein, an inhibitory/instability
sequence of a transcript is a regulatory sequence that
resides within an mRNA transcript and is either (1)
responsible for rapid turnover of that mRNA and can
destabilize a second indicator/reporter mRNA when fused to
that indicator/reporter mRNA, or (2) responsible for
underutilization of a mRNA and can cause decreased protein
production from a second indicator/reporter mRNA when
fused to that second indicator/reporter mRNA, or (3) both
of the above. The inhibitory/instability sequence of a
gene is the gene sequence that encodes an
inhibitory/instability sequence of a transcript. As used
herein, utilization refers to the overall efficiency of
translation of an mRNA.

The methods of the invention are particularly
useful for increasing the stability and/or utilization of
a mRNA without changing its protein coding capacity.
However, alternative embodiments of the invention in which
the inhibitory/instability sequence is mutated in such a
way that the amino acid sequence of the encoded protein is
changed to include conservative or non-conservative amino
acid substitutions, while still retaining the function of
the originally encoded protein, are also envisioned as
part of the invention.

These methods are useful for allowing or 
increasing the expression of genes which would otherwise 
not be expressed or which would be poorly expressed 
because of the presence of INS regions in the mRNA 
transcript. The invention provides methods of increasing 
the production of a protein encoded by a gene which 
encodes an mRNA containing an inhibitory/instability 
region by altering the portion of the nucleotide sequence 
of any gene encoding the inhibitory/instability region. 

The methods, constructs and host cells of the 
invention are useful for increasing the amount of protein 
produced by any gene which encodes an mRNA transcript 
which contains an INS. Examples of such genes include
those coding for growth factors, interferons,
interleukins, and the fos proto-oncogene protein, as well
as the genes coding for HIV-1 gag and env proteins.

The method of the invention is exemplified by
the mutational inactivation of an INS within the coding
region of the HIV-1 gag gene which results in increased
gag expression, and by constructs useful for
Rev-independent gag expression in human cells. This
mutational inactivation of the inhibitory/instability
sequences involves introducing multiple point mutations
into the AU-rich inhibitory sequences within the coding
region of the gag gene which, due to the degeneracy of
nucleotide coding sequences, do not affect the amino acid
sequence of the gag protein.
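
The principle that point mutations can lower AU (AT) content without changing the encoded protein follows directly from the degeneracy of the genetic code, and can be illustrated with a short sketch. This is not the mutagenesis procedure of the invention, which uses designed oligonucleotides carrying clustered substitutions; the greedy per-codon rule and the function names below are assumptions made purely for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def at_count(codon: str) -> int:
    """Number of A or T bases in a codon."""
    return sum(1 for base in codon if base in "AT")


def reduce_at_silently(dna: str) -> str:
    """Swap each codon for a synonymous codon with the fewest A/T bases.

    Illustrative greedy rule only. The original codon is kept whenever it
    already ties for the lowest A/T count, so no needless changes are made.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    out = []
    for i in range(0, usable, 3):
        codon = dna[i:i + 3]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        out.append(min(synonyms, key=lambda c: (at_count(c), c != codon)))
    return "".join(out)


original = "ATGAAATTA"                  # encodes Met-Lys-Leu
mutated = reduce_at_silently(original)  # fewer A/T bases, same protein
```

Because every replacement codon is drawn from the synonyms of the original codon, `translate(mutated)` always equals `translate(original)`. In practice the choice of substitutions would also weigh factors this sketch ignores, such as codon usage in the host cell.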

The constructs of the invention are exemplified 
by vectors containing the gag, env, and pol genes which
have been mutated in accordance with the methods of this 
invention and the host cells are exemplified by human 
HLtat cells containing these vectors. 

The invention also relates to using the 
constructs of the invention in immunotherapy and 
immunoprophylaxis, e.g., as a vaccine, or in genetic 
therapy after expression in humans. Such constructs can 
include or be incorporated into retroviral vectors or 
other expression vectors or they may also be directly 
injected into tissue cells resulting in efficient 
expression of the encoded protein or protein fragment. 
These constructs may also be used for in vivo or in vitro
gene replacement, e.g., by homologous recombination with a
target gene in situ.

The invention also relates to certain 
exemplified constructs which can be used to simply and 
rapidly detect and/or further define the boundaries of 
inhibitory/instability sequences in any mRNA which is 
known or suspected to contain such regions, whether the 
INS are within the coding region or in the 3' UTR or both.
Once the INS regions of the genes have been located and/or 
further defined through the use of these vectors, the same 
vectors can be used in mutagenesis experiments to 
eliminate the identified INS without affecting the coding 
capacity of the gene, thereby allowing an increase in the 
amount of protein produced from expression vectors 
containing these mutated genes. The invention also 
relates to methods of using these constructs and to host 
cells containing these constructs. 

The constructs of the invention which can be
used to detect instability/inhibitory regions within an
mRNA are exemplified by the vectors p19, p17M1234,
p37M1234 and p37M1-10D, which are set forth in Fig. 1(B)
and Fig. 6. p37M1234 and p37M1-10D are the preferred
constructs, due to the existence of a commercially
available ELISA test which allows the simple and rapid
detection of any changes in the amount of expression of
the gag indicator/reporter protein. However, any
constructs which contain the elements depicted between the
long terminal repeats in the aforementioned constructs of
Fig. 1(B) and Fig. 6, and which can be used to detect
instability/inhibitory regions within a mRNA, are also
envisioned as part of this invention.

The existence of inhibitory/instability 
sequences has been known in the art, but no solution to 
the problem which allowed increased expression of the 
genes encoding the mRNAs containing these sequences within
coding regions by making multiple nucleotide 
substitutions, without altering the coding capacity of the 
gene, has heretofore been disclosed. 

IV. BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1. (A) Structure of the HIV-1 genome. Boxes indicate
the different viral genes. (B) Structure of the gag
expression plasmids (see infra). Plasmid p17 contains the
complete HIV-1 5' LTR and sequences up to the BssHII
restriction site at nucleotide (nt) 257. (The nucleotide
numbering refers to the revised nucleotide sequence of the
HIV-1 molecular clone pHXB2 (G. Myers et al., Eds., Human
retroviruses and AIDS. A compilation and analysis of
nucleic acid and amino acid sequences (Los Alamos National
Laboratory, Los Alamos, New Mexico, 1991), incorporated
herein by reference).) This sequence is followed by the
p17gag coding sequence spanning nt 336-731 (represented as
an open box) immediately followed by a translational stop
codon and a linker sequence. Adjacent to the linker is
the HIV-1 3' LTR from nt 8561 to the last nucleotide of
the U5 region. Plasmid p17R contains in addition the 330
nt StyI fragment encompassing the RRE (L. Solomin et al.,
J. Virol. 64:6010-6017 (1990)) (represented as a stippled
box) 3' to the p17gag coding sequence. The RRE is followed
by HIV-1 sequences from nt 8021 to the last nucleotide of
the U5 region of the 3' LTR. Plasmids p19 and p19R were
generated by replacing the HIV-1 p17gag coding sequence in
plasmids p17 and p17R, respectively, with the RSV p19gag
coding sequence (represented as a black box). Plasmid
p17M1234 is identical to p17, except for the presence of
28 silent nucleotide substitutions within the gag coding
region, indicated by XXX. Wavy lines represent plasmid
sequences. Plasmid p17M1234(731-1424) and plasmid
p37M1234 are described immediately below and in the
description. These vectors are illustrative of constructs
which can be used to determine whether a particular
nucleotide sequence encodes an INS. In this instance,
vector p17M1234, which contains an indicator gene (here,
p17gag), represents the control vector and vectors
p17M1234(731-1424) and p37M1234 represent vectors in which
the nucleotide sequence of interest (here the p24gag coding
region) is inserted into the vector either 3' to the stop
codon of the indicator gene or is fused in frame to the
coding region of the indicator gene, respectively. (C)
Construction of expression vectors for identification of
gag INS and for further mutagenesis. p17M1234 was used as
a vector to insert additional HIV-1 gag sequences
downstream from the coding region of the altered p17gag
gene. Three different fragments indicated by nucleotide
numbers were inserted into vector p17M1234 as described
below. To generate plasmids p17M1234(731-1081),
p17M1234(731-1424) and p17M1234(731-2165), the indicated
fragments were inserted 3' to the stop codon of the p17gag
coding sequence in p17M1234. In expression assays (data
not shown), p17M1234(731-1081) and p17M1234(731-1424)
expressed high levels of p17gag protein. In contrast,
p17M1234(731-2165) did not express p17gag protein,
indicating the presence of additional INS within the HIV-1
gag coding region. To generate plasmids
p17M1234(731-1081)NS, p37M1234 and p55M1234, the stop codon at the end
of the altered p17gag gene and all linker sequences in
p17M1234 were eliminated by oligonucleotide-directed
mutagenesis and the resulting plasmids restored the gag
open reading frame as in HIV-1. In expression assays
(data not shown) p37M1234 expressed high levels of protein
as determined by western blotting and ELISA assays whereas
p55M1234 did not express any detectable gag protein.
Thus, the addition of sequences 3' to the p24 region
resulted in the elimination of protein expression,
indicating that nucleotide sequence 1424-2165 contains an
INS. This experiment demonstrated that p37M1234 is an
appropriate vector to analyze additional INS.
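
The reasoning used above to place an INS between nt 1424 and nt 2165 can be written out as a small interval calculation: among nested fragments sharing the same start, the INS must lie between the end of the longest fragment that still permits expression and the end of the shortest fragment that abolishes it. The sketch below is illustrative; the function name and the data layout are assumptions, and the expression calls are simplified to yes/no.

```python
def localize_ins(fragments):
    """Bracket an INS from nested insert fragments and their expression calls.

    fragments: list of (start_nt, end_nt, expressed) tuples for inserts that
    share a common start. Returns (lower_nt, upper_nt) bounding the region
    that must contain an INS, or None if every fragment allowed expression.
    """
    silent_ends = [end for _, end, expressed in fragments if not expressed]
    if not silent_ends:
        return None
    upper = min(silent_ends)
    permissive_ends = [
        end for _, end, expressed in fragments if expressed and end < upper
    ]
    if permissive_ends:
        lower = max(permissive_ends)
    else:
        # No shorter permissive fragment: the INS can start anywhere from
        # the common fragment start.
        lower = min(start for start, _, _ in fragments)
    return (lower, upper)


# Fragments inserted into p17M1234 and the expression results stated above.
gag_fragments = [
    (731, 1081, True),   # p17M1234(731-1081): high p17gag expression
    (731, 1424, True),   # p17M1234(731-1424): high p17gag expression
    (731, 2165, False),  # p17M1234(731-2165): no p17gag expression
]
ins_region = localize_ins(gag_fragments)
```

With the three fragments listed, the bracket is nt 1424 to nt 2165, matching the conclusion drawn in the text. Adding intermediate fragments narrows the bracket further.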

Fig. 2. Gag expression from the different vectors. (A)
HLtat cells were transfected with plasmid p17, p17R, or
p17M1234 in the absence (-) or presence (+) of Rev (see
infra). The transfected cells were analyzed by
immunoblotting using a human HIV-1 patient serum. (B)
Plasmid p19 or p19R was transfected into HLtat cells in
the absence (-) or presence (+) of Rev. The transfected
cells were analyzed by immunoblotting using rabbit
anti-RSV p19gag serum. HIV or RSV proteins served as
markers in the same gels. The positions of p17gag and p19gag
are indicated at right.

Fig. 3. mRNA analysis on northern blots. (A) HLtat cells
were transfected with the indicated plasmids in the
absence (-) or presence (+) of Rev. 20 μg of total RNA
prepared from the transfected cells were analyzed (see
infra). (B) RNA production from plasmid p19 or p19R was
similarly analyzed in the absence (-) or presence (+) of
Rev.

Fig. 4. Nucleotide sequence of the HIV-1 p17gag region.
The locations of the 4 oligonucleotides (M1-M4) used to
generate all mutants are underlined. The silent
nucleotide substitutions introduced by each mutagenesis
oligonucleotide are indicated below the coding sequence.
Numbering starts from nt +1 of the viral mRNA.

Fig. 5. Gag expression by different mutants. HLtat cells
were transfected with the various plasmids indicated at
the top of the figure. Plasmid p17R was transfected in
the absence (-) or presence (+) of Rev, while the other
plasmids were analyzed in the absence of Rev. p17gag
production was assayed by immunoblotting as described in
Fig. 2.

Fig. 6. Expression vectors used in the identification
and elimination of additional INS elements in the gag
region. The gag and pol region nucleotides included in
each vector are indicated by lines. The position of some
gag and pol oligonucleotides is indicated at the top of
the figure, as are the coding regions for p17gag, p24gag,
p15gag, protease and p66pol proteins. Vector p37M1234 was
further mutagenized using different combinations of
oligonucleotides. One obtained mutant gave high levels of
p24 after expression. It was analyzed by sequencing and
found to contain four mutant oligonucleotides M6gag,
M7gag, M8gag and M10gag. Other mutants containing
different combinations of oligos did not show an increase
in expression, or only partial increase in expression.
p55BM1-10 and p55AM1-10 were derived from p37M1-10D.
p55M1-13P0 contains additional mutations in the gag and
pol regions included in the oligonucleotides M11gag,
M12gag, M13gag and M0pol. The hatched boxes indicate the
location of the mutant oligonucleotides; the hatched boxes
containing circles indicate mutated regions containing
ATTTA sequences, which may contribute to instability
and/or inhibition of the mRNA; and the open boxes
containing triangles indicate mutated regions containing
AATAAA sequences, which may contribute to instability
and/or inhibition of the mRNA. Typical levels of p24gag
expression in human cells after transfections as described
supra are shown at the right (in pg/ml).

Fig. 7. Eukaryotic expression plasmids used to study env
expression. The different expression plasmids are derived
from pNL15E (Schwartz et al., J. Virol. 64:5448-5456
(1990)). The generation of the different constructs is
described in the text. The numbering follows the
corrected HXB2 sequence (Myers et al., 1991, supra; Ratner
et al., Hamatol. Bluttransfus. 31:404-406 (1987); Ratner
et al., AIDS Res. Hum. Retroviruses 3:57-69 (1987);
Solomin et al., J. Virol. 64:6010-6017 (1990)), starting
with the first nucleotide of R as +1. 5'SS, 5' splice
site; 3'SS, 3' splice site.

Fig. 8. Env expression is Rev dependent in the absence of
functional splice sites. Plasmids p15ESD- and p15EDSS (C)
were transfected in the absence or presence of a rev
expression plasmid (pL3crev) into HLtat cells. One day
later, the cells were harvested for analyses of RNA and
protein. Total RNA was extracted and analyzed on Northern
blots (B). The blots were hybridized with a
nick-translated probe spanning XhoI-SacI (nt 8443 to 9118)
of HXB2. Protein production was measured by western blots
to detect cell-associated Env using a mixture of HIV-1
patient sera and rabbit anti-gp120 antibody (A).
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Fig. 9. Env production from the gpl20 expression plasmids. 

The indicated plasmids were transfected into HLtat cells 
in duplicate plates. A rev expression plasmid (pL3srev) 
was cotransfected as indicated. One day later, the cells 
were harvested for analyses of RNA and protein. Total RNA 
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was extracted and analyzed on Northern blots (A) . The 
blots were hybridized using a nick- translated probe 
spanning nt 6158 to 7924. Protein production (B) was 
measured by immunoprecipitation after labeling for 5 h 
with 200 mCi/ml of J5 S- cysteine to detect secreted 
processed Env (gpl20) . 

Fig. 10. The identification of INS elements within gp120 
and gp41 using the p19 (RSV gag) test system. Schematic 
structure of exon 5E containing the env ORF. Different 
fragments (A to G) of the gp41 portion and fragment H of 
the vpu/gp120 portion were PCR amplified and inserted into 
the unique EcoRI site located downstream of the RSV gag 
gene in p19. The location of the sequences included in 
the amplified fragments is indicated to the right using 
the HXB2R numbering system. Fragments A and B are amplified 
from pNL15E and pNL15EDSS (in which the splice acceptor 
sites 7A, 7B and 7 have been deleted), respectively, using 
the same oligonucleotide primers. They are 276 and 234 
nucleotides long, respectively. Fragment C was amplified 
from pNL15EDSS as a 323 nucleotide fragment. Fragment F 
is a HpaI-KpnI restriction fragment of 362 nucleotides. 
Fragment E was amplified as a 668 nucleotide fragment from 
pNL15EDSS; therefore, the major splice donor at nucleotide 
5592 of HXB2 has been deleted. The rest of the fragments 
were amplified from pNL15E as indicated in the figure. 
HLtat cells were transfected with these constructs. One 
day later, the cells were harvested and p19gag production 
was determined by Western blot analysis using the 
anti-RSV Gag antibody. The expression of Gag from these 
plasmids was compared to Gag production of p19. SA, splice 
acceptor; B, BamHI; H, HpaI; X, XhoI; K, KpnI. The down 
regulatory effect of INS contained within the different 
fragments is indicated at right. 

Fig. 11. The identification of INS elements within gp120 
and gp41 using the p37M1-10D (mutant INS p37gag expression 
system) test system. Schematic structure of the env ORF. 
Different fragments (1 to 7) of env were PCR amplified as 
indicated in the figure and inserted into the polylinker 
located downstream of the p37 mutant gag gene in 
p37M1-10D. Fragments 1 to 6 were amplified from the 
molecular clone pLW2.4, a gift of Dr. M. Reitz, which is 
very similar to HXB2R. Clone pLW2.4 was derived from an 
individual infected by the same HIV-1 strain IIIB from 
which the HXB2R molecular clone has been derived. 
Fragment 7 was cloned from pNL43. For consistency and 
clarity, the numbering follows the HXB2R system. HLtat 
cells were transfected with these constructs. One day 
later, the cells were harvested and p24gag production was 
determined by antigen capture assay. The expression of 
Gag from these plasmids was compared to Gag production of 
p37M1-10D. The down regulatory effect of each fragment is 
indicated at right. 

Fig. 12. Elimination of the negative effects of CRS in 
the pol region. Nucleotides 3700-4194 of HIV-1 were 
inserted in vector p37M1234 as indicated. This resulted 
in the inhibition of gag expression. Using mutant 
oligonucleotides M9pol-M12pol (P9-P12), several mutated 
CRS clones were isolated and characterized. One of them, 
p37M1234RCRSP10+P12p, contains the mutations indicated in 
Fig. 13. This clone produced high levels of gag. 
Therefore, the combination of mutations in 
p37M1234RCRSP10+P12p eliminated the INS, while mutations 
only in the region of P10 or of P12 did not eliminate the 
INS. 

Fig. 13. Point mutations eliminating the negative effects 
of CRS in the pol region (nucleotides 3700-4194). The 
combination of mutations able to completely inactivate the 
inhibitory/instability element within the CRS region of 
HIV-1 pol (nucleotides 3700-4194) is shown under the 
sequence in small letters. These mutations are contained 
within oligonucleotides M10pol and M12pol (see Table 2). 
The M12pol oligonucleotide contains additional mutations 
that were not introduced into p37M1234RCRSP10+P12p (see 
Fig. 12), as determined by DNA sequencing. 

Fig. 14. Plasmid map and nucleotide sequence of the 
efficient gag expression vector p37M1-10D. (A) Plasmid 
map of vector p37M1-10D. The plasmid contains a 
pBluescriptKS(-) backbone, human genomic sequences 
flanking the HIV-1 sequences as found in the pNL43 genomic 
clone, HIV-1 LTRs and the p37gag region (p17 and p24). The 
p17 region has been mutagenized using oligonucleotides M1 
to M4, and the p24 region has been mutagenized using 
oligonucleotides M6, M7, M8 and M10, as described in the 
text. The coding region for p37 is flanked by the 5' and 
3' HIV-1 LTRs, which provide promoter and polyadenylation 
signals, as indicated by the arrows. Three consecutive 
arrows indicate the U5, R, and U3 regions of the LTR, 
respectively. The transcribed portions of the LTRs are 
shown in black. The translational stop codon inserted at 
the end of the p24 coding region is indicated at position 
1818. Some restriction endonuclease cleavage sites are 
also indicated. (B-D) Complete nucleotide sequence of 
p37M1-10D. The amino acid sequence of the p37gag protein 
is shown under the coding region. Symbols are as above. 
Numbering starts at the first nucleotide of the 5' LTR. 

It is to be understood that both the foregoing 
general description and the following detailed description 
are exemplary and explanatory only, and are not 
restrictive of the invention, as claimed. The 
accompanying drawings, which are incorporated in and 
constitute a part of the specification, illustrate an 
embodiment of the invention and, together with the 
description, serve to explain the principles of the 
invention. 

The invention comprises methods for eliminating 
intragenic inhibitory/instability regions of an mRNA by 
(a) identifying the intragenic inhibitory/instability 
regions, and (b) mutating the intragenic 
inhibitory/instability regions by making multiple point 
mutations. These mutations may be clustered. This method 
does not require the identification of the exact location 
or knowledge of the mechanism of function of the INS. 
Nonetheless, the results set forth herein allow the 
conclusion that multiple regions within mRNAs participate 
in determining stability and utilization and that many of 
these elements act at the level of RNA transport, 
turnover, and/or localization. Generally, the mutations 
are such that the amino acid sequence encoded by the mRNA 
is unchanged, although conservative and non-conservative 
amino acid substitutions are also envisioned as part of 
the invention where the protein encoded by the mutated 
gene is substantially similar to the protein encoded by 
the non-mutated gene. 

The nucleotides to be altered can be chosen 
randomly, the only requirement being that the amino acid 
sequence of the encoded protein remain unchanged; or, if 
conservative and non-conservative amino acid substitutions 
are to be made, the only requirement is that the protein 
encoded by the mutated gene be substantially similar to 
the protein encoded by the non-mutated gene. 

If the INS region is AT rich or GC rich, it is 
preferable that it be altered so that it has a content of 
about 50% G and C and about 50% A and T. If the INS 
region contains less-preferred codons, it is preferable 
that those be altered to more-preferred codons. If 
desired, however (e.g., to make an A and T rich region 
more G and C rich), more-preferred codons can be altered 
to less-preferred codons. If the INS region contains 
conserved nucleotides, some of those conserved nucleotides 
could be altered to non-conserved nucleotides. Again, the 
only requirement is that the amino acid sequence encoded 
by the protein remain unchanged; or, if conservative and 
non-conservative amino acid substitutions are to be made, 
the only requirement is that the protein encoded by the 
mutated gene be substantially similar to the protein 
encoded by the non-mutated gene. 

As used herein, conserved nucleotides means 
evolutionarily conserved nucleotides for a given gene, 
since this conservation may reflect the fact that they are 
part of a signal involved in the inhibitory/instability 
determination. Conserved nucleotides can generally be 
determined from published references about the gene of 
interest or can be determined by using a variety of 
computer programs available to practitioners of the art. 

Less-preferred and more-preferred codons for 
various organisms can be determined from codon usage 
charts, such as those set forth in T. Maruyama et al., 
Nucl. Acids Res. 14:r151-r197 (1986) and in S. Aota et 
al., Nucl. Acids Res. 16:r315-r402 (1988), or through use 
of a computer program, such as that disclosed in U.S. 
Patent No. 5,082,767 entitled "Codon Pair Utilization", 
issued to G. W. Hatfield et al. on January 21, 1992, which 
is incorporated herein by reference. 

Generally, the method of the invention is 
carried out as follows: 

1. Identification of an mRNA containing an INS 

The rate at which a particular protein is made 
is usually proportional to the cytoplasmic level of the 
mRNA which encodes it. Thus, a candidate for an mRNA 
containing an inhibitory/instability sequence is one whose 
mRNA or protein is either not detectably expressed or is 
expressed poorly as compared to the level of expression of 
a reference mRNA or protein under the control of the same 
or similar strength promoter. Differences in the steady 
state levels of a particular mRNA (as determined, for 
example, by Northern blotting), when compared to the 
steady state level of mRNA from another gene under the 
control of the same or similar strength promoter, which 
cannot be accounted for by changes in the apparent rate of 
transcription (as determined, for example, by nuclear 
run-on assays) indicate that the gene is a candidate for an 
unstable mRNA. In addition or as an alternative to being 
unstable, cytoplasmic mRNAs may be poorly utilized due to 
various inhibitory mechanisms acting in the cytoplasm. 
These effects may be mediated by specific mRNA sequences 
which are named herein as "inhibitory sequences". 

Candidate mRNAs containing 
inhibitory/instability regions include mRNAs from genes 
whose expression is tightly regulated, e.g., many 
oncogenes, growth factor genes and genes for biological 
response modifiers such as interleukins. Many of these 
genes are expressed at very low levels, decay rapidly and 
are modulated quickly and transiently under different 
conditions. The negative regulation of expression at the 
level of mRNA stability and utilization has been 
documented in several cases and has been proposed to be 
occurring in many other cases. Examples of genes for 
which there is evidence for post-transcriptional 
regulation due to the presence of inhibitory/instability 
regions in the mRNA include the cellular genes encoding 
granulocyte-monocyte colony stimulating factor (GM-CSF); 
proto-oncogenes c-myc, c-myb, c-sis, c-fos; interferons 
(alpha, beta and gamma IFNs); interleukins (IL1, IL2 and 
IL3); tumor necrosis factor (TNF); lymphotoxin (Lym); IgG1 
induction factor (IgG IF); granulocyte colony stimulating 
factor (G-CSF); transferrin receptor (TfR); and 
galactosyltransferase-associated protein (GTA); HIV-1 
genes encoding env, gag and pol; the E. coli genes for 
6-phosphogluconate dehydrogenase (gnd) and btuB; and the 
yeast gene for MATα1 (see the discussion in the 
"Background Art" section, above). The genes encoding the 
cellular proto-oncogenes c-myc and c-fos, as well as the 
yeast gene for MATα1 and the HIV-1 genes for gag, env and 
pol are genes for which there is evidence for 
inhibitory/instability regions within the coding region in 
addition to evidence for inhibitory/instability regions 
within the non-coding region. Genes encoding or suspected 
of encoding mRNAs containing inhibitory/instability 
regions within the coding region are particularly relevant 
to the invention. 

After identifying a candidate unstable or poorly 
utilized mRNA, the in vivo half-life (or stability) of 
that mRNA can be studied by conducting pulse-chase 
experiments (i.e., labeling newly synthesized RNAs with a 
radioactive precursor and monitoring the decay of the 
radiolabeled mRNA in the absence of label); or by 
introducing in vitro transcribed mRNA into target cells 
(either by microinjection, calcium phosphate 
co-precipitation, electroporation, or other methods known in 
the art) to monitor the in vivo half-life of the defined 
mRNA population; or by expressing the mRNA under study 
from a promoter which can be induced and which shuts off 
transcription soon after induction, and estimating the 
half-life of the mRNA which was synthesized during this 
short transcriptional burst; or by blocking transcription 
pharmacologically (e.g., with Actinomycin D) and following 
the decay of the particular mRNA at various time points 
after the addition of the drug by Northern blotting or RNA 
protection (e.g., S1 nuclease) assays. Methods for all the 
above determinations are well established. See, e.g., 
M.W. Hentze et al., Biochim. Biophys. Acta 1020:281-292 
(1991) and references cited therein. See also 
S. Schwartz et al., J. Virol. 66:150-159 (1992). The most 
useful measurement is how much protein is produced, 
because this includes all possible INS mechanisms. 
Examples of various mRNAs which have been shown to contain 
or which are suspected to contain INS regions are 
described above. Some of these mRNAs have been shown to 
have half-lives of less than 30 minutes when their mRNA 
levels are measured by Northern blots (see, e.g., D. 
Wreschner and G. Rechavi, Eur. J. Biochem. 172:333-340 
(1988)). 
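
By way of illustration only, a half-life can be estimated from a transcription-block time course (e.g., Northern blot band intensities at successive times after addition of Actinomycin D) as sketched below. The sketch assumes simple first-order decay, which is an assumption made here and not a requirement of the method; the function name and the numerical values are hypothetical and are not data from this specification.

```python
import math

def estimate_half_life(times_min, signals):
    """Estimate mRNA half-life (minutes) assuming first-order decay.

    Fits ln(signal) = ln(N0) - k * t by ordinary least squares and
    returns ln(2) / k. Signals must be positive (e.g. band intensities).
    """
    logs = [math.log(s) for s in signals]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # slope of the fitted line equals -k
    if slope >= 0:
        raise ValueError("signal does not decay over the time course")
    return math.log(2) / -slope

# Hypothetical time course: minutes after drug addition, relative signal
times = [0, 15, 30, 60]
signals = [100.0, 50.0, 25.0, 6.25]
half_life = estimate_half_life(times, signals)
```

A half-life well below that of a reference mRNA expressed from a promoter of similar strength would mark the mRNA as a candidate for containing an INS.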

2. Localization of Instability Determinants 

When an unstable or poorly utilized mRNA has 
been identified, the next step is to search for the 
responsible (cis-acting) RNA sequence elements. Detailed 
methods for localizing the cis-acting 
inhibitory/instability regions are set forth in each of 
the references described in the "Background Art" section, 
above, and are also discussed infra. The exemplified 
constructs of the present invention can also be used to 
localize INS (see below). Cis-acting sequences 
responsible for specific mRNA turnover can be identified 
by deletion and point mutagenesis as well as by the 
occasional identification of naturally occurring mutants 
with an altered mRNA stability. 

In short, to evaluate whether putative 
regulatory sequences are sufficient to confer mRNA 
stability control, DNA sequences coding for the suspected 
INS regions are fused to an indicator (or reporter) gene 
to create a gene coding for a hybrid mRNA. The DNA 
sequences fused to the indicator (or reporter) gene can be 
cDNA, genomic DNA or synthesized DNA. Examples of 
indicator (or reporter) genes that are described in the 
references set forth in the "Background Art" section 
include the genes for neomycin, β-galactosidase, 
chloramphenicol acetyltransferase (CAT), and luciferase, 
as well as the genes for β-globin, PGK1 and ACT1. See 
also Sambrook et al., Molecular Cloning, A Laboratory 
Manual, 2d ed., Cold Spring Harbor Laboratory, Cold Spring 
Harbor, New York (1989), pp. 16.56-16.67. Other genes 
which can be used as indicator genes are disclosed herein 
(i.e., the gag gene of the Rous Sarcoma Virus (which lacks 
an inhibitory/instability region) and the Rev independent 
HIV-1 gag genes of constructs p17M1234, p37M1234 and 
p37M1-10D, which have been mutated to inactivate the 
inhibitory/instability region and which constitute one 
aspect of the invention). In general, virtually any gene 
encoding a mRNA which is stable or which is expressed at 
relatively high levels (defined here as being stable 
enough or expressed at high enough level so that any 
decrease in the level of the mRNA or expressed protein can 
be detected by standard methods) can be used as an 
indicator or reporter gene, although the constructs 
p37M1234 and p37M1-10D, which are exemplified herein, are 
preferred for reasons set forth below. Preferred methods 
of creating hybrid genes using these constructs and 
testing the expression of mRNA and protein from these 
constructs are also set forth below. 

In general, the stability and/or utilization of 
the mRNAs generated by the indicator gene and the hybrid 
genes consisting of the indicator gene fused to the 
sequences suspected of encoding an INS region are tested 
by transfecting the hybrid genes into host cells which are 
appropriate for the expression vector used to clone and 
express the mRNAs. The resulting levels of mRNA are 
determined by standard methods of determining mRNA 
stability, e.g., Northern blots, S1 mapping or PCR methods, 
and the resulting levels of protein produced are 
quantitated by protein measuring assays, such as ELISA, 
immunoprecipitation and/or western blots. The 
inhibitory/instability region (or regions, if there are 
more than one) will be identified by a decrease in the 
protein expression and/or stability of the hybrid mRNA as 
compared to the control indicator mRNA. Note that if the 
ultimate goal is to increase production of the encoded 
protein, the identification of the INS is most preferably 
carried out in the same host cell as will be used for the 
production of the protein. 

Examples of some of the host cells that have 
been used to detect INS sequences include somatic 
mammalian cells, Xenopus oocytes, yeast and E. coli. See, 
e.g., G. Shaw and R. Kamen, Cell 46:659-667 (1986) 
(discussed supra) which localized instability sequences in 
GM-CSF by inserting putative inhibitory sequences into the 
3' UTR of the β-globin gene, causing the otherwise stable 
β-globin mRNA to become unstable when transfected into 
mouse or human cells. See also I. Laird-Offringa et al., 
Nucleic Acids Res. 19:2387-2394 (1991) which localized 
inhibitory/instability sequences in c-myc using hybrid 
c-myc-neomycin resistance genes introduced into rat 
fibroblasts, and M. Lundigran et al., Proc. Natl. Acad. 
Sci. USA 88:1479-1483 (1991) which localized 
inhibitory/instability sequences in the btuB gene by using 
hybrid btuB-lacZ genes introduced into E. coli. For 
examples of reported localization of specific 
inhibitory/instability sequences within a transcript of 
HIV-1 by destabilization of an otherwise long-lived 
indicator transcript, see, e.g., M. Emerman, Cell 
57:1155-1165 (1989) (replaced 3' UTR of env gene with part 
of HBV and introduced into COS-1 cells); S. Schwartz et 
al., J. Virol. 66:150-159 (1992) (gag gene fusions with Rev 
independent tat reporter gene introduced into HeLa cells); 
F. Maldarelli et al., J. Virol. 65:5732-5743 (1991) 
(gag/pol gene fusions with Rev independent tat reporter 
gene or chloramphenicol acetyltransferase (CAT) gene 
introduced into HeLa and SW480 cells); and A. Cochrane et 
al., J. Virol. 65:5303-5313 (1991) (pol gene fusions with 
CAT gene or rat proinsulin gene introduced into COS-1 and 
CHO cells). 

It is anticipated that in vitro mRNA degradation 
systems (e.g., crude cytoplasmic extracts) to assay mRNA 
turnover in vitro will complement ongoing in vivo analyses 
and help to circumvent some of the limitations of the in 
vivo systems. See M.W. Hentze et al., Biochim. Biophys. 
Acta 1020:281-292 (1991) and references cited therein. 
See also D. Wreschner and G. Rechavi, Eur. J. Biochem. 
172:333-340 (1988), which analyzed exogenous mRNA 
stability in a reticulocyte lysate cell-free system. 

In the method of the invention, the whole gene 
of interest may be fused to an indicator or reporter gene 
and tested for its effect on the resulting hybrid mRNA in 
order to determine whether that gene contains an 
inhibitory/instability region or regions. To further 
localize the INS within the gene of interest, fragments of 
the gene of interest may be prepared by sequentially 
deleting sequences from the gene of interest from either 
the 5' or 3' ends or both. The gene of interest may also 
be separated into overlapping fragments by methods known 
in the art (e.g., with restriction endonucleases, etc.). 
See, e.g., S. Schwartz et al., J. Virol. 66:150-159 
(1992). Preferably, the gene is separated into 
overlapping fragments about 300 to 2000 nucleotides in 
length. Two types of vector constructs can be made. To 
permit the detection of inhibitory/instability regions 
that do not need to be translated in order to function, 
vectors can be constructed in which the gene of interest 
(or its fragments or suspected INS) can be inserted into 
the 3' UTR downstream from the stop codon of an indicator 
or reporter gene. This does not permit translation 
through the INS. To test the possibility that some 
inhibitory/instability sequences may act only after 
translation of the mRNA, vectors can be constructed in 
which the gene of interest (or its fragments or suspected 
INS) is inserted into the coding region of the 
indicator/reporter gene. This method will permit the 
detection of inhibitory/instability regions that do need 
to be translated in order to function. The hybrid 
constructs are transfected into host cells, and the 
resulting mRNA levels are determined by standard methods 
of determining mRNA stability, e.g., Northern blots, S1 
mapping or PCR methods, as set forth above and as 
described in most of the references cited in the 
"Background Art" section. See also Sambrook et al. 
(1989), supra, for experimental methods. The protein 
produced from such genes is also easily quantitated by 
existing assays, such as ELISAs, immunoprecipitation and 
western blots, which are also described in most of the 
references cited in the "Background Art" section. See 
also Sambrook et al. (1989), supra, for experimental 
methods. The hybrid DNAs containing the 
inhibitory/instability region (or regions, if there are 
more than one) will be identified by a decrease in the 
protein expression and/or stability of the hybrid mRNA as 
compared to the control indicator mRNA. The use of 
various fragments of the gene permits the identification 
of multiple independently functional 
inhibitory/instability regions, if any, while the use of 
overlapping fragments lessens the possibility that an 
inhibitory/instability region will not be identified as a 
result of its being cut in half, for example. 
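
As an illustrative sketch only, the coordinates of overlapping fragments covering a gene of interest can be planned computationally before primers or restriction sites are chosen. The fragment size of 600 nucleotides and overlap of 150 nucleotides used below are hypothetical defaults chosen within the 300 to 2000 nucleotide range stated above; the function name is likewise an assumption and is not part of the described method.

```python
def overlapping_fragments(seq, size=600, overlap=150):
    """Split seq into fragments of `size` nt that overlap by `overlap` nt.

    Returns (start, end, fragment) tuples with 1-based inclusive
    coordinates. The last fragment may be shorter than `size`.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    fragments = []
    start = 0
    while True:
        end = min(start + size, len(seq))
        fragments.append((start + 1, end, seq[start:end]))
        if end == len(seq):
            break
        start += step
    return fragments

# Hypothetical 1500-nt sequence used only to show the coordinates
example = "ACGT" * 375
plan = [(s, e) for s, e, _ in overlapping_fragments(example)]
```

Because each fragment shares its terminal 150 nucleotides with its neighbor, an element shorter than the overlap that straddles one fragment boundary is still contained intact in the adjacent fragment.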

The exemplified test vectors set forth in Fig. 
1(B) and Fig. 6 and described herein, e.g., vectors 
p17M1234, p37M1234, p37M1-10D and p19, can be used to 
assay for the presence and location of INS in various 
RNAs, including INS which are located within coding 
regions. These vectors can also be used to determine 
whether a gene of interest not yet characterized has INS 
which are candidates for mutagenesis curing. These 
vectors have a particular advantage over the prior art in 
that the same vectors can be used in the mutagenesis step 
of the invention (described below) in which the identified 
INS is eliminated without affecting the coding capacity of 
the gene. 

The method of using these vectors involves 
introducing the entire gene, entire cDNA or fragments of 
the gene ranging from approximately 300 nucleotides to 
approximately 2 kilobases 3' to the coding region for gag 
protein using unique restriction sites which are 
engineered into the vectors. The expression of the gag 
gene in HLtat cells is measured at both the RNA and 
protein levels, and compared to the expression of the 
starting vectors. A decrease in expression indicates the 
presence of INS candidates that may be cured by 
mutagenesis. The method of using the vectors exemplified 
in Fig. 1 herein involves introducing the entire gene and 
fragments of the gene of interest into vectors p17M1234, 
p37M1234 and p19. The fragments are 
preferably 300-2000 nucleotides long. Plasmid DNA is 
prepared in E. coli and purified by the CsCl method. 

To permit detection of inhibitory/instability 
regions which do not need to be translated in order to 
function, the entire gene and fragments of the gene of 
interest are introduced into vectors p17M1234, p37M1234 or 
p19 3' of the stop codon of the p17gag coding region. To 
allow the detection of inhibitory/instability regions that 
affect expression only when translated, the described 
vectors can be manipulated so that the coding region of 
the entire gene or fragments of the gene of interest are 
fused in frame to the expressed gag protein gene. For 
example, a fragment containing all or part of the coding 
region of the gene of interest can be inserted exactly 3' 
to the termination codon of the gag coding sequence in 
vector p37M1234 and the termination codon of gag and the 
linker sequences can be removed by oligonucleotide 
mutagenesis in such a way as to fuse the gag reading frame 
to the reading frame of the gene of interest. 

RNA and protein production from the two 
expression vectors (e.g., p37M1234 containing the fragment 
of the gene of interest inserted directly 3' of the stop 
codon of the gag coding region, with the gag termination 
codon intact, and p37M1234 containing the fragment of the 
gene of interest inserted in frame with the gag coding 
region, with the gag termination codon deleted) are then 
compared after transfection of purified DNA into HLtat 
cells. 

The expression of these vectors after 
transfection into human cells is monitored at both the 
level of RNA and protein production. RNA levels are 
quantitated by, e.g., Northern blots, S1 mapping or PCR 
methods. Protein levels are quantitated by, e.g., western 
blot or ELISA methods. p37M1234 and p37M1-10D are ideal 
for quantitative analysis because a fast non-radioactive 
ELISA protocol can be used to detect gag protein (DUPONT 
or COULTER gag antigen capture assay). A decrease in the 
level of expression of the gag antigen indicates the 
presence of inhibitory/instability regions within the 
cloned gene or fragment of the gene of interest. 
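
The comparison just described can be tabulated as sketched below. This is an illustration only: the 0.5 cutoff, the function names and the antigen values are hypothetical assumptions introduced here, and no particular threshold is prescribed by the method.

```python
def relative_expression(hybrid_level, control_level):
    """Expression of a hybrid construct as a fraction of the parent vector."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return hybrid_level / control_level

def flag_ins_candidates(measurements, control_level, threshold=0.5):
    """Return names of fragments whose relative expression is below threshold.

    `measurements` maps a fragment name to its measured gag antigen level
    (e.g. pg/ml from an antigen capture assay). `threshold` is an assumed,
    user-chosen cutoff.
    """
    return [
        name
        for name, level in measurements.items()
        if relative_expression(level, control_level) < threshold
    ]

# Hypothetical antigen capture readings (pg/ml), not data from this document
parent_vector_level = 1000.0
readings = {"fragment_1": 950.0, "fragment_2": 120.0, "fragment_3": 480.0}
candidates = flag_ins_candidates(readings, parent_vector_level)
```

Fragments flagged in this way would then be taken forward to the mutagenesis step.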

After the inhibitory/instability regions have 
been identified, the vectors containing the appropriate 
INS fragments can be used to prepare single-stranded DNA 
and then used in mutagenesis experiments with specific 
chemically synthesized oligonucleotides in the clustered 
mutagenesis protocol described below. 

3. Mutation of the Inhibitory/Instability 
Regions to Generate Stable mRNAs 

Once the inhibitory/instability sequences are 
located within the coding region of an mRNA, the gene is 
modified to remove these inhibitory/instability sequences 
without altering the coding capacity of the gene. 
Alternatively, the gene is modified to remove the 
inhibitory/instability sequences, simultaneously altering 
the coding capacity of the gene to encode either 
conservative or non-conservative amino acid substitutions. 

In the method of the invention, the most general 
method of eliminating the INS in the coding region of the 
gene of interest is by making multiple mutations in the 
INS region of the gene or gene fragments, without changing 
the amino acid sequence of the protein encoded by the 
gene; or, if conservative and non-conservative amino acid 
substitutions are to be made, the only requirement is that 
the protein encoded by the mutated gene be substantially 
similar to the protein encoded by the non-mutated gene. 
It is preferred that all of the suspected 
inhibitory/instability regions, if more than one, be 
mutated at once. Later, if desired, each 
inhibitory/instability region can be mutated separately in 
order to determine the smallest region of the gene that 
needs to be mutated in order to generate a stable mRNA. 
The ability to mutagenize long DNA regions at the same 
time can decrease the time and effort needed to produce 
the desired stable and/or highly expressed mRNA and 
resulting protein. The altered gene or gene fragments 
containing these mutations will then be tested in the 
usual manner, as described above, e.g., by fusing the 
altered gene or gene fragment with a reporter or indicator 
gene and analyzing the level of mRNA and protein produced 
by the altered genes after transfection into an 
appropriate host cell. If the level of mRNA and protein 
produced by the hybrid gene containing the altered gene or 
gene fragment is about the same as that produced by the 
control construct encoding only the indicator gene, then 
the inhibitory/instability regions have been effectively 
eliminated from the gene or gene fragment due to the 
alterations made in the INS. 

In the method of the invention, more than two 
point mutations will be made in the INS region. 
Optionally, point mutations may be made in at least about 
10% of the nucleotides in the inhibitory/instability 
region. These point mutations may also be clustered. The 
nucleotides to be altered can be chosen randomly (i.e., 
not chosen because of AT or GC content or the presence or 
absence of rare or preferred codons), the only requirement 
being that the amino acid sequence encoded by the protein 
remain unchanged; or, if conservative and non-conservative 
amino acid substitutions are to be made, the only 
requirement is that the protein encoded by the mutated 
gene be substantially similar to the protein encoded by 
the non-mutated gene. 
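
For illustration, a proposed set of point mutations can be checked against the two criteria just stated, namely that the encoded amino acid sequence is unchanged and that a given fraction of the nucleotides has been altered. The sketch below assumes the standard genetic code and an in-frame region; the function names and the nine-nucleotide sequences are hypothetical and are not taken from the mutant oligonucleotides of the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate an in-frame DNA sequence; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def check_silent(wild_type, mutant):
    """Return (coding_unchanged, fraction_of_nucleotides_changed)."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length")
    changed = sum(a != b for a, b in zip(wild_type.upper(), mutant.upper()))
    return translate(wild_type) == translate(mutant), changed / len(wild_type)

# Hypothetical AT-rich region (Lys-Ile-Asp) and a silent, more GC-rich variant
unchanged, fraction = check_silent("AAAATAGAT", "AAGATCGAC")
```

In this toy example three of nine nucleotides are altered while the encoded tripeptide is preserved.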

In the method of the present invention, the gene 
sequence can be mutated so that the encoded protein 
remains the same due to the fact that the genetic code is 
degenerate, i.e., many of the amino acids may be encoded 
by more than one codon. The base code for serine, for 
example, is six-way degenerate such that the codons TCT, 
TCG, TCC, TCA, AGT, and AGC all code for serine. 
Similarly, threonine is encoded by any one of codons ACT, 
ACA, ACC and ACG. Thus, a plurality of different DNA 
sequences can be used to code for a particular set of 
amino acids. The codons encoding the other amino acids 
are TTT and TTC for phenylalanine; TTA, TTG, CTT, CTC, CTA 
and CTG for leucine; ATT, ATC and ATA for isoleucine; ATG 
for methionine; GTT, GTC, GTA and GTG for valine; CCT, CCC, 
CCA and CCG for proline; GCT, GCC, GCA and GCG for 
alanine; TAT and TAC for tyrosine; CAT and CAC for 
histidine; CAA and CAG for glutamine; AAT and AAC for 
asparagine; AAA and AAG for lysine; GAT and GAC for 
aspartic acid; GAA and GAG for glutamic acid; TGT and TGC 
for cysteine; TGG for tryptophan; CGT, CGC, CGA, CGG, AGA 
and AGG for arginine; and GGT, GGC, GGA and GGG for 
glycine. Charts depicting the codons (i.e., the genetic 
code) can be found in various general biology or 
biochemistry textbooks. 
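The degeneracy described above can be illustrated with a short script. The following is only a minimal sketch, not part of the disclosed method: it builds the standard genetic code from the conventional TCAG ordering, lists the synonymous codons for an amino acid, and checks that a mutated coding sequence still encodes the same protein. The function names (translate, synonymous_codons, is_silent) are hypothetical and chosen for illustration.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
# "*" marks the three stop codons.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a coding sequence (DNA or RNA letters) codon by codon."""
    dna = dna.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid):
    """Return all codons for a one-letter amino acid code, sorted."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def is_silent(original, mutated):
    """True if the mutated sequence encodes the same amino acid sequence."""
    return len(original) == len(mutated) and translate(original) == translate(mutated)


if __name__ == "__main__":
    print(synonymous_codons("S"))            # the six serine codons
    print(is_silent("AAATATTTA", "AAGTACCTA"))  # Lys-Tyr-Leu in both cases
```

A check of this kind is useful after designing clustered mutations, since any change that alters the translated sequence is no longer a silent mutation.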

In the method of the present invention, if the 
portion(s) of the gene encoding the inhibitory/instability 
regions are AT-rich, it is preferred, but not believed to 
be necessary, that most or all of the mutations in the 
inhibitory/instability region be the replacement of A and 
T with G and C nucleotides, making the regions more 
GC-rich, while still maintaining the coding capacity of the 
gene. If the portion(s) of the gene encoding the 
inhibitory/instability regions are GC-rich, it is 
preferred, but not believed to be necessary, that most or 
all of the mutations in the inhibitory/instability region 
be the replacement of G and C nucleotides with A and T 
nucleotides, making the regions less GC-rich, while still 
maintaining the coding capacity of the gene. If the INS 
region is either AT-rich or GC-rich, it is most preferred 
that it be altered so that it has a content of about 50% G 
and C and about 50% A and T. The AT- (or AU-) content 
(or, alternatively, the GC-content) of an 
inhibitory/instability region or regions can be calculated 
by using a computer program designed to make such 
calculations. Examples of such programs, used to 
determine the AT-richness of the HIV-1 gag 
inhibitory/instability regions exemplified herein, are the 
GCG Analysis Package for the VAX (University of Wisconsin) 
and the Gene Works Package (Intelligenetics). 
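The AT/GC calculation itself is simple enough to sketch directly. The following is an illustrative example only and is not the GCG or Gene Works software; the function names are hypothetical. The test sequence is the mutated oligonucleotide M1 (SEQ ID NO: 6) given in this description.

```python
def at_gc_counts(seq):
    """Return (AT count, GC count) for a DNA or RNA sequence."""
    seq = seq.upper().replace("U", "T")  # treat AU-content as AT-content
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return at, gc


def at_fraction(seq):
    """Fraction of A+T among the unambiguous bases of the sequence."""
    at, gc = at_gc_counts(seq)
    return at / (at + gc)


# Mutated oligonucleotide M1 as listed in this description (SEQ ID NO: 6).
M1 = "ccagggggaaagaagaagtacaagctaaagcacatcgtatgggcaagcagg"

if __name__ == "__main__":
    print(at_gc_counts(M1))             # (25, 26)
    print(round(at_fraction(M1), 3))    # close to the preferred 50% A and T
```

For M1 this gives 25 A/T and 26 G/C nucleotides, in agreement with the ratio stated for that region after mutation.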

In the method of the invention, if the INS 
region contains less-preferred codons, it is preferable 
that those be altered to more-preferred codons. If 
desired, however (e.g., to make an AT-rich region more 
GC-rich), more-preferred codons can be altered to 
less-preferred codons. It is also preferred, but not 
believed to be necessary, that less-preferred or rarely 
used codons be replaced with more-preferred codons. 
Optionally, only the most rarely used codons (identified 
from published codon usage tables, such as in T. Maruyama 
et al., Nucl. Acids Res. 14(Supp.):r151-r197 (1986)) can be 
replaced with preferred codons, or alternatively, most or 
all of the rare codons can be replaced with preferred 
codons. Generally, the choice of preferred codons to use 
will depend on the codon usage of the host cell in which 
the altered gene is to be expressed. Note, however, that 
the substitution of more-preferred codons with 
less-preferred codons is also functional, as shown in the 
example below. 

As noted above, coding sequences are chosen on 
the basis of the genetic code and, preferably, on the 
preferred codon usage in the host cell or organism in 
which the mutated gene of this invention is to be 
expressed. In a number of cases the preferred codon usage 
of a particular host or expression system can be 
ascertained from available references (see, e.g., T. 
Maruyama et al., Nucl. Acids Res. 14(Supp.):r151-r197 
(1986)), or can be ascertained by other methods (see, 
e.g., U.S. Patent No. 5,082,767 entitled "Codon Pair 
Utilization", issued to G. W. Hatfield et al. on January 
21, 1992, which is incorporated herein by reference). 
Preferably, sequences will be chosen to optimize 
transcription and translation as well as mRNA stability so 
as to ultimately increase the amount of protein produced. 
Selection of codons is thus, for example, guided by the 
preferred use of codons by the host cell and/or the need 
to provide for desired restriction endonuclease sites and 
could also be guided by a desire to avoid potential 
secondary structure constraints in the encoded mRNA 
transcript. Potential secondary structure constraints can 
be identified by the use of computer programs such as the 
one described in M. Zuker et al., Nucl. Acids Res. 9:133 
(1981). More than one coding sequence may be chosen in 
situations where the codon preference is unknown or 
ambiguous for optimum codon usage in the chosen host cell 
or organism. However, any correct set of codons would 
encode the desired protein, even if translated with less 
than optimum efficiency. 
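The use of a codon usage table to pick preferred codons can be sketched as follows. This is a minimal illustration under stated assumptions: the frequencies are the per-1000-codon values for human genes quoted elsewhere in this description (from Maruyama et al.), only a few amino acids are included, and the cutoff of 10 per 1000 used to flag "rare" codons is a hypothetical value chosen for the example, not one taken from the disclosure.

```python
# Occurrences per 1000 codons in human genes, as quoted in this description.
# Only a subset of amino acids is shown.
USAGE_PER_1000 = {
    "Lys": {"aaa": 22.0, "aag": 35.8},
    "Tyr": {"tat": 12.4, "tac": 18.4},
    "His": {"cat": 9.8, "cac": 14.3},
    "Glu": {"gaa": 26.8, "gag": 41.6},
    "Asn": {"aat": 16.9, "aac": 23.6},
    "Gln": {"caa": 11.5, "cag": 32.7},
}


def most_preferred(amino_acid, usage=USAGE_PER_1000):
    """Return the most frequently used codon for the amino acid."""
    codons = usage[amino_acid]
    return max(codons, key=codons.get)


def rare_codons(usage=USAGE_PER_1000, threshold=10.0):
    """Return codons used less often than `threshold` per 1000 (assumed cutoff)."""
    return sorted(
        codon
        for codons in usage.values()
        for codon, freq in codons.items()
        if freq < threshold
    )


if __name__ == "__main__":
    for aa in USAGE_PER_1000:
        print(aa, "->", most_preferred(aa))
    print("rare:", rare_codons())
```

For a different host cell the table would be replaced with that host's codon usage, which is why the usage data is passed as a parameter rather than fixed inside the functions.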

In the method of the invention, if the INS 
region contains conserved nucleotides, it is also 
preferred, but not believed to be necessary, that 
conserved nucleotide sequences in the 
inhibitory/instability region be mutated. Optionally, at 
least approximately 75% of the mutations made in the 
inhibitory/instability region may involve the mutation of 
conserved nucleotides. Conserved nucleotides can be 
determined by using a variety of computer programs 
available to practitioners of the art. 

In the method of the invention, it is also 
anticipated that inhibitory/instability sequences can be 
mutated such that the encoded amino acids are changed to 
contain one or more conservative or non-conservative amino 
acids yet still provide for a functionally equivalent 
protein. For example, one or more amino acid residues 
within the sequence can be substituted by another amino 
acid of a similar polarity which acts as a functional 
equivalent, resulting in a neutral substitution in the 
amino acid sequence. Substitutes for an amino acid within 
the sequence may be selected from other members of the 
class to which the amino acid belongs. For example, the 
nonpolar (hydrophobic) amino acids include alanine, 
leucine, isoleucine, valine, proline, phenylalanine, 
tryptophan and methionine. The polar neutral amino acids 
include glycine, serine, threonine, cysteine, tyrosine, 
asparagine, and glutamine. The positively charged (basic) 
amino acids include arginine, lysine and histidine. The 
negatively charged (acidic) amino acids include aspartic 
acid and glutamic acid. 
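The amino acid classes listed above lend themselves to a simple lookup. The sketch below is illustrative only; it treats a substitution as "conservative" when both residues fall in the same class as listed in this description, which is a simplifying assumption, and the function names are hypothetical.

```python
# Amino acid classes exactly as grouped in this description
# (three-letter residue codes).
AMINO_ACID_CLASSES = {
    "nonpolar": {"Ala", "Leu", "Ile", "Val", "Pro", "Phe", "Trp", "Met"},
    "polar neutral": {"Gly", "Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "basic": {"Arg", "Lys", "His"},
    "acidic": {"Asp", "Glu"},
}


def class_of(residue):
    """Return the class name for a three-letter residue code."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")


def is_conservative(original, substitute):
    """True if both residues belong to the same class (assumed criterion)."""
    return class_of(original) == class_of(substitute)


if __name__ == "__main__":
    print(is_conservative("Leu", "Ile"))  # both nonpolar
    print(is_conservative("Lys", "Glu"))  # basic versus acidic
```

Whether a given substitution actually yields a functionally equivalent protein must still be determined experimentally; the class lookup only narrows the candidates.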

In the exemplified method of the present 
invention, all of the regions in the HIV-1 gag gene 
suspected to have inhibitory/instability activity were 
first mutated at once over a region approximately 270 
nucleotides in length using clustered site-directed 
mutagenesis with four different oligonucleotides spanning 
a region of approximately 300 nucleotides to generate the 
construct p17M1234, described infra, which encodes a 
stable mRNA. The four oligonucleotides, which are depicted 
in Fig. 4, are 

M1: ccagggggaaagaagaagtacaagctaaagcacatcgtatgggcaagcagg 
(SEQ ID NO: 6); 
M2: ccttcagacaggatcagaggagcttcgatcactatacaacacagtagc 
(SEQ ID NO: 7); 
M3: accctctattgtgtgcaccagcggatcgagatcaaggacaccaaggaagc 
(SEQ ID NO: 8); and 
M4: gagcaaaacaagtccaagaagaaggcccagcaggcagcagctgacacagg 
(SEQ ID NO: 9). 

These oligonucleotides are 51 (M1), 48 (M2), 50 (M3) and 
50 (M4) nucleotides in length. Each oligonucleotide 
introduced several point mutations over an area of 19-22 
nucleotides (see infra). The number of nucleotides 5' to 
the first mutated nucleotide was 14 (M1); 18 (M2); 17 
(M3); and 11 (M4); and the number of nucleotides 3' to the 
last mutated nucleotide was 15 (M1); 8 (M2); 14 (M3); and 
17 (M4). The ratios of AT to GC nucleotides present in 
each of these regions before mutation were 33AT/18GC (M1); 
30AT/18GC (M2); 29AT/21GC (M3) and 27AT/23GC (M4). The 
ratios of AT to GC nucleotides present in each of these 
regions after mutation were 25AT/26GC (M1); 24AT/24GC 
(M2); 23AT/27GC (M3) and 22AT/28GC (M4). A total of 26 
codons were changed. The number of times the codon appears 
in human genes per 1000 codons (from T. Maruyama et al., 
Nucl. Acids Res. 14(Supp.):r151-r197 (1986)) is listed in 
parentheses next to the codon. In the example, 8 codons 
encoding lysine (Lys) were changed from aaa (22.0) to aag 
(35.8); two codons encoding tyrosine (Tyr) were changed 
from tat (12.4) to tac (18.4); two codons encoding leucine 
(Leu) were changed from tta (5.9) to cta (6.1); two codons 
encoding histidine (His) were changed from cat (9.8) to 
cac (14.3); three codons encoding isoleucine (Ile) were 
changed from ata (5.1) to atc (24.0); two codons encoding 
glutamic acid (Glu) were changed from gaa (26.8) to gag 
(41.6); one codon encoding arginine (Arg) was changed from 
aga (10.8) to cga (5.2) and one codon encoding arginine 
(Arg) was changed from agg (11.4) to cgg (7.7); one codon 
encoding asparagine (Asn) was changed from aat (16.9) to 
aac (23.6); two codons encoding glutamine (Gln) were 
changed from caa (11.5) to cag (32.7); one codon encoding 
serine (Ser) was changed from agt (8.7) to tcc (18.7); and 
one codon encoding alanine (Ala) was changed from gca 
(12.7) to gcc (29.8). 
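The figures in the preceding passage can be cross-checked with a short tally. The following is a sketch for illustration only: the codon changes and oligonucleotide dimensions are transcribed from the text above using the standard a/c/g/t letters, and the function names are hypothetical.

```python
# Codon changes as listed in the text:
# (amino acid, original codon, mutated codon, number of codons changed).
CODON_CHANGES = [
    ("Lys", "aaa", "aag", 8),
    ("Tyr", "tat", "tac", 2),
    ("Leu", "tta", "cta", 2),
    ("His", "cat", "cac", 2),
    ("Ile", "ata", "atc", 3),
    ("Glu", "gaa", "gag", 2),
    ("Arg", "aga", "cga", 1),
    ("Arg", "agg", "cgg", 1),
    ("Asn", "aat", "aac", 1),
    ("Gln", "caa", "cag", 2),
    ("Ser", "agt", "tcc", 1),
    ("Ala", "gca", "gcc", 1),
]

# Oligonucleotide dimensions as stated in the text:
# (total length, nt 5' of first mutation, nt 3' of last mutation).
OLIGO_GEOMETRY = {
    "M1": (51, 14, 15),
    "M2": (48, 18, 8),
    "M3": (50, 17, 14),
    "M4": (50, 11, 17),
}


def codons_changed(changes):
    """Total number of codons altered."""
    return sum(count for _aa, _old, _new, count in changes)


def point_mutations(changes):
    """Total number of single-nucleotide differences across all changes."""
    return sum(
        count * sum(a != b for a, b in zip(old, new))
        for _aa, old, new, count in changes
    )


def mutated_window(length, five_prime, three_prime):
    """Span from the first to the last mutated nucleotide, inclusive."""
    return length - five_prime - three_prime


if __name__ == "__main__":
    print(codons_changed(CODON_CHANGES))
    print(point_mutations(CODON_CHANGES))
    print({name: mutated_window(*dims) for name, dims in OLIGO_GEOMETRY.items()})
```

With these inputs the tally gives 26 codons and 28 single-nucleotide differences, and mutated windows of 22, 22, 19 and 22 nucleotides for M1 through M4, consistent with the stated range of 19-22 nucleotides.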

The techniques of oligonucleotide-directed 
site-specific mutagenesis employed to effect the 
modifications in structure or sequence of the DNA molecule 
are known to those of skill in the art. The target DNA 
sequences which are to be mutagenized can be cDNA, genomic 
DNA or synthesized DNA sequences. Generally, these DNA 
sequences are cloned into an appropriate vector, e.g., a 
bacteriophage M13 vector, and single-stranded template DNA 
is prepared from a plaque generated by the recombinant 
bacteriophage. The single-stranded DNA is annealed to the 
synthetic oligonucleotides and the mutagenesis and 
subsequent steps are performed by methods well known in 
the art. See, e.g., M. Smith and S. Gillam, in Genetic 
Engineering: Principles and Methods, Plenum Press 3:1-32 
(1981) (review) and T. Kunkel, Proc. Natl. Acad. Sci. USA 
82:488-492 (1985). See also, Sambrook et al. (1989), 
supra. The synthetic oligonucleotides can be synthesized 
on a DNA synthesizer (e.g., Applied Biosystems) and 
purified by electrophoresis by methods known in the art. 
The length of the selected or prepared 
oligodeoxynucleotides using this method can vary. There 
are no absolute size limits. As a matter of convenience, 
for use in the process of this invention, the shortest 
length of the oligodeoxynucleotide is generally 
approximately 20 nucleotides and the longest length is 
generally approximately 60 to 100 nucleotides. The size 
of the oligonucleotide primers is determined by the 
requirement for stable hybridization of the primers to the 
regions of the gene in which the mutations are to be 
induced, and by the limitations of the currently available 
methods for synthesizing oligonucleotides. The factors to 
be considered in designing oligonucleotides for use in 
oligonucleotide-directed mutagenesis (e.g., overall size, 
size of portions flanking the mutation(s)) are described 
by M. Smith and S. Gillam in Genetic Engineering: 
Principles and Methods, Plenum Press 3:1-32 (1981). In 
general, the overall length of the oligonucleotide will be 
such as to optimize stable, unique hybridization at the 
mutation site with the 5' and 3' extensions from the 
mutation site being of sufficient size to avoid editing of 
the mutation(s) by the exonuclease activity of the DNA 
polymerase. Oligonucleotides used for mutagenesis in the 
present invention will generally be at least about 20 
nucleotides, usually about 40 to 60 nucleotides in length 
and usually will not exceed about 100 nucleotides in 
length. The oligonucleotides will usually contain at 
least about five bases 3' of the altered codons. 

In the preferred mutagenesis protocol of the 
present invention, the INS-containing expression vectors 
contain the BLUESCRIPT plasmid vector as a backbone. This 
enables the preparation of double-stranded as well as 
single-stranded DNA. Single-stranded uracil-containing 
DNA is prepared according to a standard protocol as 
follows: The plasmid is transformed into an F' bacterial 
strain (e.g., DH5αF'). A colony is grown and infected 
with the helper phage M13-VCS [Stratagene #20025; 1 x 10^11 
pfu/ml]. This phage is used to infect a culture of the E. 
coli strain CJ236 and single-stranded DNA is isolated 
according to standard methods. 0.25 µg of single-stranded 
DNA is annealed with the synthesized oligonucleotides (5 
µl of each oligo, dissolved at a concentration of 5 
OD260/ml). The synthesized oligonucleotides are usually 
about 40 to 60 nucleotides in length and are designed to 
contain a perfect match of approximately 10 nucleotides at 
each end. They may contain as many changes as desired 
within the remaining 20-40 nucleotides. The 
oligonucleotides are designed to cover the region of 
interest and they may be next to each other or there may 
be gaps between them. Up to six different 
oligonucleotides have been used at the same time, although 
it is believed that the use of more than six 
oligonucleotides at the same time would also work in the 
method of this invention. After annealing, elongation 
with T4 polymerase produces the second strand which does 
not contain uracil. The free ends are ligated using 
ligase. This results in double-stranded DNA which can be 
used to transform E. coli strain HB101. The mutated 
strand which does not contain uracil produces 
double-stranded DNA, which contains the introduced 
mutations. Individual colonies are picked and the 
mutations are quickly verified by sequence analysis. 
Alternatively or additionally, this mutagenesis method can 
be (and has been) used to select for different 
combinations of oligonucleotides which result in different 
mutant phenotypes. This facilitates the analysis of the 
regions important for function and is helpful in 
subsequent experiments because it allows the analysis of 
exact sequences involved in the INS. In addition to the 
exemplified mutagenesis of the INS-1 region of HIV-1 
described herein, this method has also been used to mutate 
in one step a region of 150 nucleotides using three 
tandemly arranged oligonucleotides that introduced a total 
of 35 mutations. The upper limit of changes is not clear, 
but it is estimated that regions of approximately 500 
nucleotides can be changed in 20% of their nucleotides in 
one step using this protocol. 

The exemplified method of mutating by using 
oligonucleotide-directed site-specific mutagenesis may be 
varied by using other methods known in the art. For 
example, the mutated gene can be synthesized directly 
using overlapping synthetic deoxynucleotides (see, e.g., 
Edge et al., Nature 292:756 (1981); Nambiar et al., 
Science 223:1299 (1984); Jay et al., J. Biol. Chem. 
259:6311 (1984)); or by using a combination of polymerase 
chain reaction generated DNAs or cDNAs and synthesized 
oligonucleotides. 

4. Determination of Stability of the 
Mutated mRNA 

The steady state level and/or stability of the 
resultant mutated mRNAs can be tested in the same manner 
as the steady state level and/or stability of the 
unmodified mRNA containing the inhibitory/instability 
regions is tested (e.g., by Northern blotting), as 
discussed in section 1, above. The mutated mRNA can be 
analyzed along with (and thus compared to) the unmodified 
mRNA containing the inhibitory/instability region(s) and 
with an unmodified indicator mRNA, if desired. As 
exemplified, the HIV-1 p17gag mutants are compared to the 
unmutated HIV-1 p17gag in transfection experiments by 
subsequent analysis of the mRNAs by Northern blot 
analysis. The proteins produced by these mRNAs are 
measured by immunoblotting and other methods known in the 
art, such as ELISA. See infra. 

VI. INDUSTRIAL APPLICABILITY 

Genes which can be mutated by the methods of 
this invention include those whose mRNAs are known or 
suspected to contain INS regions. These 
genes include, for example, those coding for growth 
factors, interferons, interleukins, the fos proto-oncogene 
protein, and HIV-1 gag, env and pol, as well as other 
viral mRNAs in addition to those exemplified herein. 
Genes mutated by the methods of this invention can be 
expressed in the native host cell or organism or in a 
different cell or organism. The mutated genes can be 
introduced into a vector such as a plasmid, cosmid, phage, 
virus or mini-chromosome and inserted into a host cell or 
organism by methods well known in the art. In general, 
the mutated genes or constructs containing these mutated 
genes can be utilized in any cell, either eukaryotic or 
prokaryotic, including mammalian cells (e.g., human (e.g., 
HeLa), monkey (e.g., Cos), rabbit (e.g., rabbit 
reticulocytes), rat, hamster (e.g., CHO and baby hamster 
kidney cells) or mouse cells (e.g., L cells)), plant cells, 
yeast cells, insect cells or bacterial cells (e.g., E. 
coli). The vectors which can be utilized to clone and/or 
express these mutated genes are the vectors which are 
capable of replicating and/or expressing the mutated genes 
in the host cell in which the mutated genes are desired to 
be replicated and/or expressed. See, e.g., F. Ausubel et 
al., Current Protocols in Molecular Biology, Greene 
Publishing Associates and Wiley-Interscience (1992) and 
Sambrook et al. (1989) for examples of appropriate vectors 
for various types of host cells. The native promoters for 
such genes can be replaced with strong promoters 
compatible with the host into which the gene is inserted. 
These promoters may be inducible. The host cells 
containing these mutated genes can be used to express 
large amounts of the protein useful in enzyme 
preparations, pharmaceuticals, diagnostic reagents, 
vaccines and therapeutics. 

Genes altered by the methods of the invention or 
constructs containing said genes may also be used for in 
vivo or in vitro gene replacement. For example, a gene 
which produces an mRNA with an inhibitory/instability 
region can be replaced with a gene that has been modified 
by the method of the invention in situ to ultimately 
increase the amount of protein expressed. Such genes 
include viral genes and/or cellular genes. Such gene 
replacement might be useful, for example, in the 
development of a vaccine and/or genetic therapy. 

The constructs and/or proteins made by using 
constructs encoding the exemplified altered gag, env, and 
pol genes could be used, for example, in the production of 
diagnostic reagents, vaccines and therapies for AIDS and 
AIDS related diseases. The inhibitory/instability 
elements in the exemplified HIV-1 gag gene may be involved 
in the establishment of a state of low virus production in 
the host. HIV-1 and the other lentiviruses cause chronic 
active infections that are not cleared by the immune 
system. It is possible that complete removal of the 
inhibitory/instability sequence elements from the 
lentiviral genome would result in constitutive expression. 
This could prevent the virus from establishing a latent 
infection and escaping immune system surveillance. The 
success in increasing expression of p17gag by eliminating 
the inhibitory sequence element suggests that one could 
produce lentiviruses without any negative elements. Such 
lentiviruses could provide a novel approach towards 
attenuated vaccines. 

For example, vectors expressing high levels of 
Gag can be used in immunotherapy and immunoprophylaxis, 
after expression in humans. Such vectors include 
retroviral vectors and also include direct injection of 
DNA into muscle cells or other receptive cells, resulting 
in the efficient expression of gag, using the technology 
described, for example, in Wolff et al., Science 
247:1465-1468 (1990), Wolff et al., Human Molecular 
Genetics 1(6):363-369 (1992) and Ulmer et al., Science 
259:1745-1749 (1993). Further, the gag constructs could be 
used in transdominant inhibition of HIV expression after 
the introduction into humans. For this application, for 
example, appropriate vectors or DNA molecules expressing 
high levels of p55gag or p37gag would be modified to 
generate transdominant gag mutants, as described, for 
example, in Trono et al., Cell 59:113-120 (1989). The 
vectors would be introduced into humans, resulting in the 
inhibition of HIV production due to the combined 
mechanisms of gag transdominant inhibition and of 
immunostimulation by the produced gag protein. In 
addition, the gag constructs of the invention could be 
used in the generation of new retroviral vectors based on 
the expression of lentiviral gag proteins. Lentiviruses 
have unique characteristics that may allow the targeting 
and efficient infection of non-dividing cells. Similar 
applications are expected for vectors expressing high 
levels of env. 

Identification of similar inhibitory/instability 
elements in SIV indicates that this virus may provide a 
convenient model to test these hypotheses. 

The exemplified constructs can also be used to 
simply and rapidly detect and/or further define the 
boundaries of inhibitory/instability sequences in any mRNA 
which is known or suspected to contain such regions, e.g., 
in mRNAs encoding various growth factors, interferons or 
interleukins, as well as other viral mRNAs in addition to 
those exemplified herein. 

The following examples illustrate certain 
embodiments of the present invention, but should not be 
construed as limiting its scope in any way. Certain 
modifications and variations will be apparent to those 
skilled in the art from the teachings of the foregoing 
disclosure and the following examples, and these are 
intended to be encompassed by the spirit and scope of the 
invention. 

EXAMPLE 1 
HIV-1 GAG GENE 

The interaction of the Rev regulatory protein of 
human immunodeficiency virus type 1 (HIV-1) with its RNA 
target, named the Rev-responsive element (RRE), is 
necessary for expression of the viral structural proteins 
(for reviews see G. Pavlakis and B. Felber, New Biol. 
2:20-31 (1990); B. Cullen and W. Greene, Cell 58:423-426 
(1989); and C. Rosen and G. Pavlakis, AIDS J. 4:499-509 
(1990)). Rev acts by promoting the nuclear export and 
increasing the stability of the RRE-containing mRNAs. 
Recent results also indicate a role for Rev in the 
efficient polysome association of these mRNAs (S. Arrigo 
and I. Chen, Genes Dev. 5:808-819 (1991); D. D'Agostino et 
al., Mol. Cell. Biol. 12:1375-1386 (1992)). Since the 
RRE-containing HIV-1 mRNAs do not efficiently produce 
protein in the absence of Rev, it has been postulated that 
these mRNAs are defective and contain 
inhibitory/instability sequences variously designated as 
INS, CRS, or IR (M. Emerman et al., Cell 57:1155-1165 
(1989); S. Schwartz et al., J. Virol. 66:150-159 (1992); 
C. Rosen et al., Proc. Natl. Acad. Sci. USA 85:2071-2075 
(1988); M. Hadzopoulou-Cladaras et al., J. Virol. 
63:1265-1274 (1989); F. Maldarelli et al., J. Virol. 
65:5732-5743 (1991); A. W. Cochrane et al., J. Virol. 
65:5305-5313 (1991)). The nature and function of these 
inhibitory/instability sequences have not been 
characterized in detail. It has been postulated that 
inefficiently used splice sites may be necessary for Rev 
function (D. Chang and P. Sharp, Cell 59:789-795 (1989)); 
the presence of such splice sites may confer 
Rev-dependence to HIV-1 mRNAs. 

Analysis of HIV-1 hybrid constructs led to the 
initial characterization of some inhibitory/instability 
sequences in the gag and pol regions of HIV-1 (S. Schwartz 
et al., J. Virol. 66:150-159 (1992); F. Maldarelli et al., 
J. Virol. 65:5732-5743 (1991); A. W. Cochrane et al., J. 
Virol. 65:5305-5313 (1991)). The identification of an 
inhibitory/instability RNA element located in the coding 
region of the p17gag matrix protein of HIV-1 was also 
reported (S. Schwartz et al., J. Virol. 66:150-159 
(1992)). It was shown that this sequence acted in cis to 
inhibit HIV-1 tat expression after insertion into a tat 
cDNA. The inhibition could be overcome by Rev-RRE, 
demonstrating that this element plays a role in regulation 
by Rev. 

1. p17gag expression plasmid 
To further study the inhibitory/instability 
element in p17gag, a p17gag expression plasmid (p17, Fig. 1) 
was constructed. The p17gag sequence was engineered to 
contain a translational stop codon immediately after the 
coding sequence and thus could produce only p17gag (the 
construction of this plasmid is described below). The 
major 5' splice site of HIV-1 upstream of the gag AUG has 
been deleted from this vector (B. Felber et al., Proc. 
Natl. Acad. Sci. USA 86:1495-1499 (1989)). To investigate 
whether plasmid p17 could produce p17gag in the absence of 
Rev and the RRE, p17 was transfected into HLtat cells (S. 
Schwartz et al., J. Virol. 64:2519-2529 (1990)) (see 
below). These cells constitutively produce HIV-1 Tat 
protein, which is necessary for transactivation of the 
HIV-1 LTR promoter. Plasmid p17 was transfected in the 
absence or presence of Rev, and the production of p17gag 
was analyzed by western immunoblotting. The results 
revealed that very low levels of p17gag protein were 
produced (Fig. 2A). The presence of Rev did not increase 
gag expression, as expected, since this mRNA did not 
contain the RRE. Next, a plasmid that contained both the 
p17gag coding sequence and the RRE (p17R, Fig. 1) was 
constructed. Like p17, this plasmid produced very low 
levels of p17gag in the absence of Rev. High levels of 
p17gag were produced only in the presence of Rev (Fig. 2A). 
These experiments suggested that an inhibitory/instability 
element was located in the p17gag coding sequence. 

Expression experiments using various eucaryotic 
vectors have indicated that several other retroviruses do 
not contain such inhibitory/instability sequences within 
their coding sequences (see, for example, J. Wills et al., 
J. Virol. 63:4331-43 (1989) and V. Morris et al., J. 
Virol. 62:349-53 (1988)). To verify these results, the 
p17gag (matrix) gene of HIV-1 in plasmid p17 was replaced 
with the coding sequence for p19gag (matrix), which is the 
homologous protein of the Rous sarcoma virus (RSV, strain 
SR-A). The resulting plasmid, p19 (Fig. 1), was identical 
to plasmid p17, except for the gag coding sequence. The 
production of p19gag protein from plasmid p19 was analyzed 
by western immunoblotting, which revealed that this 
plasmid produced high levels of p19gag (Fig. 2A). These 
experiments demonstrated that the p19gag coding sequence of 
RSV, in contrast to p17gag of HIV-1, could be efficiently 
expressed in this vector, indicating that the gag region 
of RSV did not contain any inhibitory/instability 
elements. A derivative of plasmid p19 that contained the 
RRE, named p19R (Fig. 1), was also constructed. 
Interestingly, only very low levels of p19gag protein were 
produced from the RRE-containing plasmid p19R in the 
absence of Rev. This observation indicated that the 
introduced RRE and 3' HIV-1 sequences exerted an 
inhibitory effect on p19gag expression from plasmid p19R, 
which is in agreement with recent data indicating that in 
the absence of Rev, a longer region at the 3' end of the 
virus including the RRE acts as an inhibitory/instability 
element (G. Nasioulas, G. Pavlakis, B. Felber, manuscript 
in preparation). In conclusion, the high levels of 
expression of RSV p19gag in the same vector reinforced the 
conclusion that an inhibitory/instability sequence within 
the HIV-1 p17gag coding region was responsible for the 
very low levels of expression. 

It was next determined whether the 
inhibitory/instability effect of the p17gag coding sequence 
was detected also at the mRNA level. Northern blot 
analysis of RNA extracted from HLtat cells transfected 
with p17 or transfected with p17R demonstrated that p17R 
produced lower mRNA levels in the absence of Rev (Fig. 3A) 
(see Example 3). A two- to eight-fold increase in p17R 
mRNA levels was observed after coexpression with Rev. 
Plasmid p17 produced mRNA levels similar to those produced 
by p17R in the absence of Rev. Notably, Rev decreased the 
levels of mRNA and protein produced by mRNAs that do not 
contain RRE. This inhibitory effect of Rev in 
cotransfection experiments has been observed for many 
other non-RRE-containing mRNAs, such as luciferase and CAT 
(L. Solomin et al., J. Virol. 64:6010-6017 (1990); D. M. 
Benko et al., New Biol. 2:1111-1122 (1990)). These results 
established that the inhibitory element in gag also 
affects the mRNA levels and are in agreement with previous 
findings (S. Schwartz et al., J. Virol. 66:150-159 
(1992)). Quantitations of the mRNA and protein levels 
produced by p17R in the absence or presence of Rev were 
performed by scanning densitometry of appropriate serial 
dilutions of the samples, and indicated that the 
difference was greater at the level of protein (60- to 
100-fold) than at the level of mRNA (2- to 8-fold). This 
result is compatible with previous findings of effects of 
Rev on mRNA localization and polysomal loading of both gag 
and env mRNAs (S. Arrigo et al., Genes Dev. 5:808-819 
(1991); D. D'Agostino et al., Mol. Cell. Biol. 12:1375- 
1386 (1992); M. Emerman et al., Cell 57:1155-1165 (1989); 
B. Felber et al., Proc. Natl. Acad. Sci. USA 86:1495-1499 
(1989); M. Malim et al., Nature (London) 338:254-257 
(1989)). Northern blot analysis of the mRNAs produced by 
the RSV gag expression plasmids revealed that p19 produced 
high mRNA levels (Fig. 3B). This further demonstrated 
that the p19gag coding sequence of RSV does not contain 
inhibitory elements. The presence of the RRE and 3' HIV-1 
sequences in plasmid p19R resulted in decreased mRNA 
levels in the absence of Rev, further suggesting that 
inhibitory elements were present in these sequences. 
Taken together, these results established that gag 
expression in HIV-1 is fundamentally different from that 
in RSV. The HIV-1 p17gag coding sequence contains a strong 
inhibitory element while the RSV p19gag coding sequence 
does not. Interestingly, plasmid p19 contains the 5' 
splice site used to generate the RSV env mRNA, which is 
located downstream of the gag AUG. This 5' splice site is 
not utilized in the described expression vectors (Fig. 
3B). Mutation of the invariable GT dinucleotide of this 
5' splice site to AT did not affect p19gag expression 
significantly (data not shown). On the other hand, the 
HIV-1 p17 expression plasmid did not contain any known 
splice sites, yet was not expressed in the absence of Rev. 
These results further indicate that sequences other than 
inefficiently used splice sites are responsible for 
inhibition of gag expression. 

2. Mutated p17gag vectors 
To investigate the exact nature of the 
15 i nhib itory element in HIV-l gag, site-directed mutagenesis 
of the pll** coding sequence with four different 
oligonucleotides, as indicated in Fig. 4, was performed. 
Each oligonucleotide introduced several point mutations 
over an area of 19-22 nucleotides. These mutations did 
not affect the amino acid sequence of the $17** protein, 
since they introduced silent codon changes. First, all 
four oligonucleotides were used simultaneously in 
mutagenesis using a single- stranded DNA template as 
described (T. Kunkel, Proc. Natl. Acad. Sci. USA 82:488- 
25 492 (1985); S. Schwartz et al., Mol. Cell. Biol. 12:207- 
219 (1992)). This allowed the simultaneous introduction 
of many point mutations over a large region of 270 nt in 
vector pl7. A mutant containing all four oligonucleotides 
was isolated and named pl7M1234. Compared to pl7 p this 
plasmid contained a total of 28 point mutations 
distributed primarily in regions with high ATT- content ■ 
The phenotype of the mutant was assessed by transfections 
into HLtat cells and subsequent analysis of pl7 w 
expression by immunoblotting. Interestingly, pl7M1234 
produced high levels of pl7« protein, higher than those 
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produced by pl7R in the presence of Rev (Fig. 2A) . This 
result demonstrated that the inhibitory/instability 
signals in pl7** mRNA had been inactivated in plasmid 
P17M1234. As expected, the presence of Rev protein did 
not increase expression from pl7M1234, but instead, had a 
slight inhibitory effect on gag expression. Thus, pl7 Bag 
expression from the mutant pl7M1234 displayed the same 
general properties as the pl9*°* of RSV, that is, a high 
constitutive level of Rev- independent gag expression. 
Northern blot analysis revealed that the mRNA levels 
produced by pl7M1234 were increased compared to those 
produced by pl7 (Pig. 3 A) . 
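
By way of illustration only, the requirement that clustered point mutations leave the encoded amino acid sequence unchanged can be checked computationally. The following Python sketch is not part of the described experiments. The function names and the 12-nucleotide example sequences are hypothetical and are not taken from the p17gag coding sequence or from the oligonucleotides of Fig. 4; the sketch assumes the standard genetic code.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding sequence (reading frame 1) into one-letter amino acids."""
    dna = dna.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_point_mutations(wild_type, mutant):
    """Count positions at which two equal-length sequences differ."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(wild_type.upper(), mutant.upper()) if a != b)


def is_silent(wild_type, mutant):
    """True if the mutant encodes the same protein as the wild type."""
    return len(wild_type) == len(mutant) and translate(wild_type) == translate(mutant)


# Hypothetical example: four point mutations, same encoded peptide (M-K-I-R).
WILD_TYPE = "ATGAAAATTAGA"
MUTANT = "ATGAAGATCCGC"
print(translate(WILD_TYPE), translate(MUTANT))
print(count_point_mutations(WILD_TYPE, MUTANT), is_silent(WILD_TYPE, MUTANT))
```

A check of this kind would be applied to each candidate mutant oligonucleotide before mutagenesis, so that only silent codon changes are introduced.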

To further examine the nature and exact location
of the minimal inhibitory/instability element, the p17gag
coding sequence in plasmid p17 was mutated with only one
of the four mutated oligonucleotides at a time. This
procedure resulted in four mutant plasmids, named p17M1,
p17M2, p17M3, and p17M4, according to the oligonucleotide
that each contains. None of these mutants produced
significantly higher levels of p17gag protein compared to
plasmid p17 (Fig. 5), indicating that the
inhibitory/instability element was not affected. The p17
coding sequence was next mutated with two oligonucleotides
at a time. The resulting mutants were named p17M12,
p17M13, p17M14, p17M23, p17M24, and p17M34. Protein
production from these mutants was minimally increased
compared with that from p17, and it was considerably lower
than that from p17M1234 (Fig. 5). In addition, a triple
oligonucleotide mutant, p17M123, also failed to express
high levels of p17gag (data not shown). These findings may
suggest that multiple inhibitory/instability signals are
present in the coding sequence of p17gag. Alternatively, a
single inhibitory/instability element may span a large
region, whose inactivation requires mutagenesis with more
than two oligonucleotides. This possibility is consistent
with previous data suggesting that a 218-nucleotide
inhibitory/instability element in the p17gag coding
sequence is required for strong inhibition of gag
expression. Further deletions of this sequence resulted
in gradual loss of inhibition (S. Schwartz et al., J.
Virol. 66:150-159 (1992)). The inhibitory/instability
element may coincide with a specific secondary structure
on the mRNA. It is currently being investigated whether a
specific structure is important for the function of the
inhibitory/instability element.

The p17gag coding sequence has a high content of
A and U nucleotides, unlike the coding sequence of p19gag
of RSV (S. Schwartz et al., J. Virol. 66:150-159 (1992);
G. Myers and G. Pavlakis, in The Retroviridae, J. Levy,
Eds. (Plenum Press, New York, NY, 1992), pp. 1-37). Four
regions with high AU content are present in the p17gag
coding sequence and have been implicated in the inhibition
of gag expression (S. Schwartz et al., J. Virol. 66:150-159
(1992)). Lentiviruses have a high AU content compared
to the mammalian genome. Regions of high AU content are
found in the gag/pol and env regions, while the multiply
spliced mRNAs have a lower AU content (G. Myers and G.
Pavlakis, in The Retroviridae, J. Levy, Eds. (Plenum
Press, New York, NY, 1992), pp. 1-37), supporting the
possibility that the inhibitory/instability elements are
associated with mRNA regions with high AU content. It has
been shown that a specific oligonucleotide sequence,
AUUUA, found at the AU-rich 3' untranslated regions of
some unstable mRNAs, may confer RNA instability (G. Shaw
and R. Kamen, Cell 46:659-667 (1986)). Although this
sequence is not present in the p17gag sequence, it is found
in many copies within gag/pol and env regions. The
association of instability elements with AU-rich regions
is not universal, since the RRE together with 3' HIV
sequences, which show a strong inhibitory/instability
activity in our vectors, is not AU-rich. These
observations suggest the presence of more than one type of
inhibitory/instability sequence. In addition to reducing
the AU content, some of the mutations introduced in
plasmid p17 changed rarely used codons to more favored
codons for human cells. Although the use of rare codons
could be an alternative explanation for poor HIV gag
expression, this type of translational regulation is not
favored by these results, since the presence of Rev
corrects the defect in gag expression. In addition, the
observation that the presence of non-translated sequences
reduced gag expression (for example, the RRE sequence in
p17R) suggests that translation of the
inhibitory/instability region is not necessary for
inhibition. Introduction of RRE and 3' HIV sequences in
p17M1234 was also able to decrease gag expression,
verifying that independent negative elements not acting
co-translationally are responsible for poor expression.
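
For illustration, regions of high AU content and occurrences of the AUUUA motif discussed above can be located with a simple scan. The Python sketch below is an assumption-laden aid and not the procedure used to obtain the results described herein. The function names, the window size, and the short test sequences are hypothetical.

```python
def to_rna(seq):
    """Normalise a DNA or RNA string to upper-case RNA letters."""
    return seq.upper().replace("T", "U")


def au_fraction(seq):
    """Fraction of A and U nucleotides in a sequence (0.0 for an empty one)."""
    rna = to_rna(seq)
    if not rna:
        return 0.0
    return (rna.count("A") + rna.count("U")) / len(rna)


def sliding_au(seq, window, step=1):
    """AU fraction for each window, returned as (start_index, fraction) pairs."""
    rna = to_rna(seq)
    return [
        (start, au_fraction(rna[start:start + window]))
        for start in range(0, len(rna) - window + 1, step)
    ]


def find_motif(seq, motif="AUUUA"):
    """0-based start positions of a motif, overlapping matches included."""
    rna = to_rna(seq)
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        start = rna.find(motif, start + 1)
    return positions


# Hypothetical toy sequences, not taken from HIV-1.
print(au_fraction("AAUUGGCC"))
print(sliding_au("AAAAGGGG", window=4, step=4))
print(find_motif("GGAUUUAUUUACC"))
```

Windows whose AU fraction is markedly higher than that of the surrounding sequence would be candidate regions for silent mutagenesis, subject to experimental confirmation.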

3. Identification and elimination of
additional INS sequences in the p24 and p15
regions of the gag gene

To examine the effect of removal of INS in the
p17gag coding region (the p17gag coding region spans
nucleotides 336-731, as described in the description of
Fig. 1(B) above, and contains the first of three parts
(i.e., p17, p24, and p15) of the gag coding region, as
indicated in Fig. 1(A) and (B)) on the expression of
the complete gag gene, expression vectors were constructed
in which additional sequences of the gag gene were
inserted 3' to the mutationally altered p17gag coding
region, downstream of the stop codon, of vector p17M1234.
Three vectors containing increasing lengths of gag
sequences were studied: p17M1234(731-1081),
p17M1234(731-1424) and p17M1234(731-2165), as shown in Fig. 1(C).
Levels of expression of p17gag were measured, with the
results indicating that the region of the mRNA encoding the
second part of the gag protein (i.e., the part encoding
the p24gag protein, which spans nucleotides 731-1424)




contains only a weak INS, as determined by a small
reduction in the amount of p17gag protein expressed by
p17M1234(731-1424) as compared with the amount of p17gag
protein expressed by p17M1234, while the region of the
mRNA encoding the third part of the gag protein (i.e., the
part encoding the p15gag protein, which spans nucleotides
1425-2165) contains a strong INS, as determined by a large
reduction in the amount of gag protein expressed by
p17M1234(731-2165) as compared with the amount of protein
expressed by p17M1234 and p17M1234(731-1424).


4. p37M1234 vector

The above analysis allowed the construction of
vector p37M1234, which expressed high levels of p37gag
precursor protein (which contains both the p17gag and p24gag
protein regions). Vector p37M1234 was constructed by
removing the stop codon at the end of the gene encoding
the altered p17gag protein and fusing the nucleotide
sequence encoding the p24gag protein into the correct
reading frame by oligonucleotide mutagenesis. This
restored the nucleotide sequence so that it encoded the
fused p17gag and p24gag protein (i.e., the p37gag protein) as
it is encoded by HIV-1. Since the presence of the p37gag
or of the p24gag protein can be quantitated easily by
commercially available ELISA kits, vector p37M1234 can be
used for inserting and testing additional fragments
suspected of containing INS. Examples of such uses are
shown below.



5. Vectors p17M1234(731-1081)NS and p55BM1234

Other vectors which were constructed in a
similar manner as was p37M1234 were p17M1234(731-1081)NS
and p55BM1234 (Fig. 1(C)). The levels of gag expression
from each of these three vectors, which allow the
translation of the region downstream (3') of the p17
coding region, were respectively similar to the levels of
gag expression from the vectors containing the nucleotide
sequences 3' to a stop codon (i.e., vectors p17M1234(731-1081),
p17M1234(731-1424) and p17M1234(731-2165),
described above). These results also demonstrate that the
INS regions in the gag gene are not affected by
translation or lack thereof through the INS region. These
results demonstrate the use of p17M1234 to detect
additional INS sequences in the HIV-1 gag coding region
(i.e., in the 1424-2165 encoding region of HIV-1 gag).
Thus, these results also demonstrate how a gene containing
one or more inhibitory/instability regions can be mutated
to eliminate one inhibitory/instability region and then
used to further locate additional inhibitory/instability
regions within that gene, if any.

6. Vectors p37M1-10D and p55M1-10
As described above, experiments indicated the
presence of INS in the p24 and p15 region of HIV-1 in
addition to those identified and eliminated in the p17gag
region of HIV-1. This is depicted schematically in Figure
6 on page 7180 of Schwartz et al., J. Virol. 66:7176-7182
(1992). In that figure, cgagM1234 is identical to
p55BM1234.

By studying the expression of p24gag protein in
vectors encoding the p24gag protein containing additional
gag and pol sequences, it was found that vectors that
contained the complete gag gene and part of the pol gene
(e.g., vector p55BM1234, see Fig. 6) were not expressed at
high levels, despite the elimination of INS-1 in the p17gag
region as described above. The inventors have
hypothesized that this is caused by the presence of
multiple INS regions able to act independently of each
other. To eliminate the additional INS, several mutant
HIV-1 oligonucleotides were constructed (see Table 2) and
incorporated in various gag expression vectors. For
example, oligonucleotides M6gag, M7gag, M8gag and M10gag
were introduced into p37M1234, resulting in p37M1-10D, and
the same oligonucleotides were introduced into p55BM1234,
resulting in p55BM1-10. These experiments revealed a
dramatic improvement of expression of p37gag (which is the
p17gag and p24gag precursor) and p55gag (which is the intact
gag precursor molecule produced by HIV-1) upon the
incorporation in the expression vectors p37M1234 and
p55BM1234 of additional mutations contained in the
oligonucleotides M6gag, M7gag, M8gag and M10gag (described
in Table 2). Fig. 6 shows that expression was
dramatically improved after the introduction of additional
mutations.

Of particular interest was p37M1-10D, which
produced very high levels of gag. This has been the
highest producing gag construct (see Fig. 6).
Interestingly, addition of gag and pol sequences as in
vectors p55BM1-10 and p55AM1-10 (Fig. 6) reduced the
levels of gag expression. Upon further mutagenesis, the
inhibitory effects of this region were partially
eliminated as shown in Fig. 6 for vector p55M1-13P0.
Introduction of mutations defined by the gag region
oligonucleotides M10gag, M11gag, M12gag, M13gag, and pol region
oligonucleotide M0pol increased the levels of gag expression
approximately six-fold over vectors such as p55BM1-10.

The HIV-1 promoter was replaced by the human
cytomegalovirus early promoter (CMV) in plasmids p37M1-10D
and p55M1-13P0 to generate plasmids pCMV37M1-10D and
pCMV55M1-13P0, respectively. For this, a fragment
containing the CMV promoter was amplified by PCR
(nucleotides -670 to +73, where +1 is the start of
transcription; see Boshart et al., Cell 41:521
(1985)). This fragment was exchanged with the StuI-BssHII
fragment in gag vectors p37M1-10D and p55M1-13P0,
resulting in the replacement of the HIV-1 promoter with
that of CMV. The resulting plasmids were compared to
those containing the HIV-1 promoter after transfection in
human cells, and gave similar high expression of gag.
Therefore, the high expression of gag can be achieved in
the total absence of any other viral protein. The
exchange of the HIV-1 promoter with other promoters is beneficial
if constitutive expression is desirable and also for
expression in other mammalian cells, such as mouse cells,
in which the HIV-1 promoter is weak.

The constructed vectors p37M1-10D and p55BM1-10
can be used for the Rev-independent production of p37gag
and p55gag proteins, respectively. In addition, these
vectors can be used as convenient reporters to identify
and eliminate additional INS in different RNA molecules.

Using the protocols described herein, regions
have been identified within the gp41 (the transmembrane
part of HIV-1 env) coding area and at the post-env 3'
region of HIV-1 which contain INS. The elimination of INS
from gag, pol and env regions will allow the expression of
high levels of authentic HIV-1 structural proteins in the
absence of the Rev regulatory factor of HIV-1. The
mutated coding sequences can be incorporated into
appropriate gene transfer vectors which may allow the
targeting of specific cells and/or more efficient gene
transfer. Alternatively, the mutated coding sequences can
be used for direct expression in human or other cells in
vitro or in vivo with the goal being the production of
high protein levels and the generation of a strong immune
response. The ultimate goal in either case is subsequent
protection from HIV infection and disease.

The described experiments demonstrate that the
inhibitory/instability sequences are required to prevent
HIV-1 expression. This block to the expression of viral
structural proteins can be overcome by the Rev-RRE
interaction. In the absence of INS, HIV-1 expression
would be similar to simpler retroviruses and would not
require Rev. Thus, the INS is a necessary component of
Rev regulation. Sequence comparisons suggest that the INS
element identified here is conserved in all HIV-1
isolates, although this has not been verified
experimentally. The majority (22 of 28) of the mutated
nucleotides in gag are conserved in all HIV-1 isolates,
while 22 of 28 are conserved also in HIV-2 (G. Myers, et
al., Eds. Human retroviruses and AIDS. A compilation and
analysis of nucleic acid and amino acid sequences (Los
Alamos National Laboratory, Los Alamos, New Mexico, 1991),
incorporated herein by reference). Several lines of
evidence indicate that all lentiviruses and other complex
retroviruses, such as the HTLV group, contain similar INS
regulatory elements. Strong INS elements have been
identified in the gag region of HTLV-1 and SIV (manuscript
in preparation). This suggests that INS are important
regulatory elements, and may be responsible for some of
the biological characteristics of the complex
retroviruses. The presence of INS in SIV and HTLV-1
suggests that these elements are conserved among complex
retroviruses. Since INS inhibit expression, it must be
concluded that their presence is advantageous to the
virus, otherwise they would be rapidly eliminated by
mutations.

The observations that the inhibitory/instability
sequences act in the absence of any other viral proteins
and that they can be inactivated by mutagenesis suggest
that these elements may be targets for the binding of
cellular factors that interact with the mRNA and inhibit
post-transcriptional steps of gene expression. The
interaction of HIV-1 mRNAs with such factors may cause
nuclear retention, resulting in either further splicing or
rapid degradation of the mRNAs. It has been proposed that
components of the splicing machinery interact with splice
sites in HIV-1 mRNAs and modulate mRNA expression (A.
Cochrane et al., J. Virol. 65:5305-5313 (1991); D. Chang
and P. Sharp, Cell 59:789-795 (1989); X. Lu et al., Proc.
Natl. Acad. Sci. USA 87:7598-7602 (1990)). However, it is
not likely that the inhibitory/instability elements
described here are functional 5' or 3' splice sites.
Thorough mapping of HIV-1 splice sites performed by
several laboratories using the Reverse Transcriptase-PCR
technique failed to detect any splice sites within gag (S.
Schwartz et al., J. Virol. 64:2519-2529 (1990); J.
Guatelli et al., J. Virol. 64:4093-4098 (1990); E. D.
Garrett et al., J. Virol. 65:1653-1657 (1991); M.
Robert-Guroff et al., J. Virol. 64:3391-3398 (1990); S. Schwartz
et al., J. Virol. 64:5448-5456 (1990); S. Schwartz et
al., Virology 183:677-686 (1991)). The suggestions that
Rev may act by dissociating unspliced mRNA from the
spliceosomes (D. Chang and P. Sharp, Cell 59:789-795
(1989)) or by inhibiting splicing (J. Kjems et al., Cell
67:169-178 (1991)) are not easily reconciled with the
knowledge that all retroviruses produce structural
proteins from mRNAs that contain unutilized splice sites.
Splicing of all retroviral mRNAs, including HIV-1 mRNAs in
the absence of Rev, is inefficient compared to splicing of
cellular mRNAs (J. Kjems et al., Cell 67:169-178 (1991);
A. Krainer et al., Genes Dev. 4:1158-1171 (1990); R. Katz
and A. Skalka, Mol. Cell. Biol. 10:696-704 (1990); C.
Stoltzfus and S. Fogarty, J. Virol. 63:1669-1676 (1989)).
The majority of the retroviruses do not produce Rev-like
proteins, yet they efficiently express proteins from
partially spliced mRNAs, suggesting that inhibition of
expression by unutilized splice sites is not a general
property of retroviruses. Experiments using constructs
expressing mutated HIV-1 gag and env mRNAs lacking
functional splice sites showed that only low levels of
these mRNAs accumulated in the absence of Rev and that
their expression was Rev-dependent (M. Emerman et al.,
Cell 57:1155-1165 (1989); B. Felber et al., Proc. Natl.
Acad. Sci. USA 86:1495-1499 (1989); M. Malim et al.,
Nature (London) 338:254-257 (1989)). This led to the
conclusion that Rev acts independently of splicing (B.




Felber et al., Proc. Natl. Acad. Sci. USA 86:1495-1499
(1989); M. Malim et al., Nature (London) 338:254-257
(1989)) and to the proposal that inhibitory/instability
elements other than splice sites are present on HIV-1
mRNAs (C. Rosen et al., Proc. Natl. Acad. Sci. USA
85:2071-2075 (1988); M. Hadzopoulou-Cladaras et al., J.
Virol. 63:1265-1274 (1989); B. Felber et al., Proc. Natl.
Acad. Sci. USA 86:1495-1499 (1989)).

Construction of the Gag Expression Plasmids
Plasmid p17R has been described as pNL17R (S.
Schwartz et al., J. Virol. 66:150-159 (1992)). Plasmid
p17 was generated from p17R by digestion with restriction
enzyme Asp718 followed by religation. This procedure
deleted the RRE and HIV-1 sequences spanning nt 8021-8561
upstream of the 3' LTR. To generate mutants of p17gag, the
p17gag coding sequence was subcloned into a modified
pBLUESCRIPT vector (Stratagene) and single-stranded
uracil-containing DNA was generated. Site-directed mutagenesis
was performed as described (T. Kunkel, Proc. Natl. Acad.
Sci. USA 82:488-492 (1985); S. Schwartz et al., Mol. Cell.
Biol. 12:207-219 (1992)). Clones containing the
appropriate mutations were selected by sequencing of
double-stranded DNA. To generate plasmid p19R, plasmid
p17R was first digested with BssHII and EcoRI, thereby
deleting the entire p17gag coding sequence, six nucleotides
upstream of the p17gag AUG and nine nucleotides of linker
sequences 3' of the p17gag stop codon. The p17gag coding
sequence in p17R was replaced by a PCR-amplified DNA
fragment containing the RSV p19gag coding sequence (R.
Weiss et al., RNA Tumor Viruses. Molecular Biology of
Tumor Viruses (Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, 1985)). This fragment contained eight
nucleotides upstream of the RSV gag AUG and the p19gag
coding sequence immediately followed by a translational
stop codon. The RSV gag fragment was derived from the
infectious RSV proviral clone S-RA (R. Weiss et al., RNA
Tumor Viruses. Molecular Biology of Tumor Viruses (Cold
Spring Harbor Laboratory, Cold Spring Harbor, New York,
1985)). p19 was derived from p19R by excising an Asp718
fragment containing the RRE and 3' HIV-1 sequences
spanning nt 8021-8561.






Transfection of HLtat Cells With Gag Expression Plasmids
HLtat cells (S. Schwartz et al., J. Virol.
64:2519-2529 (1990)) were transfected using the calcium
coprecipitation technique (F. Graham and A. Van der
Eb, Virology 52:456-460 (1973)) as described (B. Felber et
al., Proc. Natl. Acad. Sci. USA 86:1495-1499 (1989)),
using 5 µg of p17, p17R, p17M1234, p19, or p19R in the
absence (-) or presence (+) of 2 µg of the Rev-expressing
plasmid pL3crev (B. Felber et al., Proc. Natl. Acad. Sci.
USA 86:1495-1499 (1989)). The total amount of DNA in
transfections was adjusted to 17 µg per 0.5 ml of
precipitate per 60 mm plate using pUC19 carrier DNA.
Cells were harvested 20 h after transfection and cell
extracts were subjected to electrophoresis on 12.5%
denaturing polyacrylamide gels and analyzed by
immunoblotting using either human HIV-1 patient serum
(Scripps) or a rabbit anti-p19gag serum. pRSV-luciferase
(J. de Wet et al., Mol. Cell. Biol. 7:725-737 (1987)), which
contains the firefly luciferase gene linked to the RSV LTR
promoter, was used as an internal standard to control for
transfection efficiency and was quantitated as described
(L. Solomin et al., J. Virol. 64:6010-6017 (1990)). The
results are set forth in Fig. 2.
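
As an illustrative aid only, the amount of pUC19 carrier DNA needed to bring each transfection to the stated 17 µg total can be computed from the stated plasmid amounts. In the Python sketch below the function name and its parameters are hypothetical. The amount of the pRSV-luciferase internal standard is not stated above, so it is represented by an optional argument that defaults to zero, which is an assumption.

```python
def carrier_dna_ug(total_ug, plasmid_ug, rev_ug=0.0, other_ug=0.0):
    """Carrier DNA (in micrograms) needed to reach the total DNA per plate.

    other_ug stands for any further plasmid whose amount is known, for
    example an internal standard; it defaults to 0.0 (an assumption).
    """
    carrier = total_ug - plasmid_ug - rev_ug - other_ug
    if carrier < 0:
        raise ValueError("plasmid amounts exceed the total DNA per plate")
    return carrier


TOTAL_UG = 17.0        # total DNA per 0.5 ml precipitate per 60 mm plate
GAG_PLASMID_UG = 5.0   # p17, p17R, p17M1234, p19 or p19R
REV_PLASMID_UG = 2.0   # pL3crev, present only in the (+) Rev transfections

without_rev = carrier_dna_ug(TOTAL_UG, GAG_PLASMID_UG)
with_rev = carrier_dna_ug(TOTAL_UG, GAG_PLASMID_UG, REV_PLASMID_UG)
print(without_rev, with_rev)
```

Under these assumptions the carrier requirement differs by exactly the amount of Rev-expressing plasmid, so that the total DNA per plate is the same in the (-) and (+) Rev transfections.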
Northern Blot Analysis

HLtat cells were transfected as described above
and harvested 20 h post transfection. Total RNA was
prepared by the heparin/DNase method (Z. Krawczyk and C.
Wu, Anal. Biochem. 165:20-27 (1987)), and 20 µg of total
RNA was subjected to northern blot analysis as described
(M. Hadzopoulou-Cladaras et al., J. Virol. 63:1265-1274
(1989)). The filters were hybridized to a nick-translated
PCR-amplified DNA fragment spanning nt 8304-9008 in the
HIV-1 3' LTR. The results are set forth in Fig. 3.


EXAMPLE 2
HIV-1 ENV GENE
Fragments of the env gene were inserted into
vectors p19 or p37M1234 and the expression of the
resulting plasmids was analyzed by transfections into
HLtat cells. It was found that several fragments
inhibited protein expression. One of the strong INS
identified was in the fragment containing nucleotides
8206-8561 ("fragment [8206-8561]"). To eliminate this
INS, the following oligonucleotides were synthesized and
used in mutagenesis experiments as specified supra. The
fragment was derived from the molecular clone pNL43, which
is almost identical to HXB2. The numbering system used
herein follows the numbering of molecular clone HXB2
throughout. The synthesized oligonucleotides follow the
pNL43 sequence.

The oligonucleotides which were used to
mutagenize fragment [8206-8561], and which made changes in
the env coding region between nucleotides 8210-8555 (the
letters in lower case indicate mutated nucleotides) were:

#1
8194-8261
GAATAGTGCTGTTAACcTcCT
AGGGTTATAG (SEQ ID NO: 10)

#2
8262-8323
AAGTATTACAAGCcGCcTAccGcGCcATcaGaCAtATcCCccGccGcATccGcCAGGG
CTTG (SEQ ID NO: 11)

#3
8335-8392
GCTATAAGATGGGcGGtAAaTGGagcAAgtcctccGTcATcGGcTGGCCTGCTGTAAG
(SEQ ID NO: 12)

#4
8393-8450
GGAAAGAATGcGcaGgGCcGAaCCcGCcGCcGAcGGaGTtGGcGCcGTATCTCGAGAC
(SEQ ID NO: 13)

#5
8451-8512
CTAGAAAAACAcGGcGCcATtACctCCtCtAAcACcGCcGCcAAtAAcGCcGCTTGTG
CCTG (SEQ ID NO: 14)

#6
8513-8572
GCTAGAAGCACAgGAaGAaGAgGAaGTcGGcTTcCCcGTt^
AG (SEQ ID NO: 15)

The expression of env was increased by the
elimination of the INS in fragment [8206-8561] as
determined by analysis of both mRNA and protein.
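
For illustration, the nucleotide coordinates given above for the six mutagenic oligonucleotides can be checked for length and for contiguity of coverage. The Python sketch below uses only the coordinate ranges stated above; the variable and function names are hypothetical, and the sketch makes no statement about the oligonucleotide sequences themselves.

```python
# Coordinate ranges (HXB2 numbering) as stated for oligonucleotides #1 to #6.
OLIGO_SPANS = {
    1: (8194, 8261),
    2: (8262, 8323),
    3: (8335, 8392),
    4: (8393, 8450),
    5: (8451, 8512),
    6: (8513, 8572),
}


def span_length(first_nt, last_nt):
    """Number of nucleotides in an inclusive coordinate range."""
    return last_nt - first_nt + 1


def uncovered_gaps(spans):
    """Inclusive (first, last) ranges lying between consecutive spans."""
    ordered = sorted(spans.values())
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
        if next_start > prev_end + 1
    ]


lengths = {number: span_length(*span) for number, span in OLIGO_SPANS.items()}
gaps = uncovered_gaps(OLIGO_SPANS)
print(lengths)
print(gaps)
```

On the stated coordinates, the oligonucleotides are 58 to 68 nucleotides long and adjoin one another except between #2 and #3, where nucleotides 8324-8334 are not covered by any of the listed oligonucleotides.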

To further characterize in detail the INS in
HIV-1 env, the coding region of env was divided into
different fragments, which were produced by PCR using
appropriate synthetic oligonucleotides, and cloned in
vector p37M1-10D. This vector was produced from p37M1234
by additional mutagenesis as described above. After
introduction into human cells, vector p37M1-10D produces
high levels of p37gag protein. Any strong INS element will
inhibit the expression of gag if ligated in the same
vector. The summary of the env fragments used is shown in
Figure 11. The results of these experiments show that,
as in HIV-1 gag, there exist multiple regions inhibiting
expression in HIV-1 env, and combinations of such regions
result in additive or synergistic inhibition. For
example, while fragments 1, 2, or 3 individually inhibit
expression by 2-6 fold, the combination of these fragments
inhibits expression by 30 fold. Based on these results,
additional mutant oligonucleotides have been synthesized
for the correction of env INS. These oligonucleotides
have been introduced in the expression vectors for HIV-1
env p120pA and p120R270 (see Fig. 7) for the development
of Rev-independent HIV-1 env expression plasmids as
discussed in detail below.
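
As a purely arithmetical illustration of the fold-inhibition figures above, the stated 30-fold combined inhibition can be compared with what two simple models would allow for three fragments that each inhibit 2-6 fold. Both models in the Python sketch below (summing the individual fold values, and multiplying them as if the fragments acted independently) are assumptions introduced here for illustration. They are not an analysis performed in the described experiments, and the names used are hypothetical.

```python
import math

INDIVIDUAL_FOLD_RANGE = (2, 6)  # stated range for fragments 1, 2 and 3
N_FRAGMENTS = 3
OBSERVED_COMBINED_FOLD = 30     # stated inhibition for the combination


def summed_fold(folds):
    """Combined value if individual fold inhibitions simply added up."""
    return sum(folds)


def multiplied_fold(folds):
    """Combined value if fragments acted independently (folds multiply)."""
    return math.prod(folds)


low, high = INDIVIDUAL_FOLD_RANGE
sum_bounds = (summed_fold([low] * N_FRAGMENTS), summed_fold([high] * N_FRAGMENTS))
product_bounds = (
    multiplied_fold([low] * N_FRAGMENTS),
    multiplied_fold([high] * N_FRAGMENTS),
)
print(sum_bounds, product_bounds)
print(OBSERVED_COMBINED_FOLD > sum_bounds[1])
print(product_bounds[0] <= OBSERVED_COMBINED_FOLD <= product_bounds[1])
```

Under these assumptions, a 30-fold inhibition exceeds the largest value obtainable by summing three fold values in the 2-6 range, and falls within the range obtainable by multiplying them.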

1. The mRNAs for gp160 and for the
extracellular domain (gp120) are defective
and their expression depends on the
presence of RRE in cis and Rev in trans

1.1 Positive and Negative Determinants for
env mRNA Expression of HIV



Previous experiments on the identification and
characterization of the env expressing cDNAs had
demonstrated that Env is produced from mRNAs that contain
exon 4AE, 4BE, or 5E (Schwartz et al., J. Virol.
64:5448-5456 (1990); Schwartz et al., Mol. Cell. Biol.
12:207-219 (1992)). All constructs generated to study the
determinants of env expression are derived from pNL15E.
This plasmid contains the HIV-1 LTR promoter, the complete
env cDNA 15E, and the HIV 3' LTR including the
polyadenylation signal (Schwartz et al., J. Virol.
64:5448-5456 (1990)) (Fig. 7). pNL15E was generated from
the molecular clone pNL4-3 (pNL4-3 is identical to pNL43
herein) (Adachi et al., J. Virol. 59:284-291 (1986)) and
lacks the splice acceptor site for exon 6D, which was used
to generate the tev mRNA (Benko et al., J. Virol.
64:2505-2518 (1990)). The Env expression plasmids were transfected
in the presence or absence of the Rev-expressing plasmid
pL3crev (Felber et al., J. Virol. 64:3734-3741 (1990)) into
HLtat cells (Schwartz et al., J. Virol. 64:2519-2529
(1990)), which constitutively express Tat (one-exon Tat).
One day later, the cells were harvested for analyses of
RNA and protein. Total RNA was extracted and analyzed on
Northern blots. Protein production was measured by
Western blots to detect cell-associated Env. In the
absence of Rev, NL15E mRNA was efficiently spliced and
produced Nef; in the presence of Rev, most of the RNA
remained unspliced and produced the Env precursor gp160,
which is processed to gp120, the secreted portion of the
precursor, and gp41.

To allow for the effects of INS to be 
distinguished and studied separately from splicing, splice 
sites known to exist within some of the fragments used 
were eliminated as discussed below. Analysis of the 
resulting expression vectors included size determination 
of the produced mRNA, providing the verification that 
splicing does not interfere with the interpretation of the 
data. 

1.2 Env expression is Rev-dependent also
in the absence of functional splice
sites

To study the effect of splicing on env
expression, the splice donor at nt 5592 was removed by
site-directed mutagenesis (changing GCAGTA to GaAtTc, and
thus introducing an EcoRI site), which resulted in plasmid
15ESD- (Fig. 7). The mRNA from this construct was
efficiently spliced and produced a small mRNA encoding Nef
(Fig. 8). Sequence analysis revealed that this spliced
mRNA was generated by the use of an alternative splice
donor located at nt 5605 (TACATgtaatg) and the common
splice acceptor site at nt 7925. In contrast to published
work (Lu et al., Proc. Natl. Acad. Sci. USA 87:7598-7602
(1990)), expression of Env from this mutant depended on
Rev. Next, the splice acceptor site at nt 7925 was
mutated. Since previous cDNA cloning had revealed that in
addition to the splice acceptor site at nt 7925 there are
two additional splice acceptor sites at nt 7897 and nt
7901 (Schwartz et al., J. Virol. 64:2519-2529 (1990)), this
region of 43 bp encompassing nt 7884 to nt 7926 was
removed. This resulted in p15EDSS (Fig. 7). Northern
blot analysis of mRNA from HLtat cells transfected with
this construct confirmed that the 15EDSS mRNA is not
spliced (Fig. 8B). Although all functional splice sites
have been removed from p15EDSS, Rev is still required for
Env production (Fig. 8A). Taken together with data
obtained by studying gag expression, these results suggest
that the presence of inefficiently used splice sites is
not the primary determinant for Rev-dependent Env
expression. It is known that at least two unused splice
sites are present in this mRNA (the alternative splice
donor at nt 5605 and the splice donor of exon 6D at nt
6269). Therefore, it cannot be ruled out that initial
spliceosome formation can occur, which does not lead to
the execution of splicing. It is possible that this is
sufficient to retain the mRNA in the nucleus and, since no
splicing occurs, that this would lead to degradation of
the mRNA. Alternatively, it is possible that
splice-site-independent RNA elements similar to those
identified within the gag/pol region (INS) are responsible
for the Rev dependency (Schwartz et al., J. Virol.
66:7176-7182 (1992); Schwartz et al., J. Virol. 66:150-159
(1992)).
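
By way of illustration, the splice-donor mutation and the splice-acceptor deletion described above can be checked with a few lines of code. The Python sketch below uses only the sequences and coordinates stated above (GCAGTA, GaAtTc, and nt 7884 to nt 7926), together with the well-known EcoRI recognition sequence GAATTC. The variable and function names are hypothetical.

```python
ORIGINAL_DONOR = "GCAGTA"   # splice donor sequence at nt 5592, as stated
MUTATED_DONOR = "GaAtTc"    # lower case marks the mutated nucleotides
ECORI_SITE = "GAATTC"       # EcoRI recognition sequence


def lowercase_positions(seq):
    """1-based positions written in lower case (the annotated mutations)."""
    return [i for i, base in enumerate(seq, start=1) if base.islower()]


def differing_positions(seq_a, seq_b):
    """1-based positions at which two equal-length sequences differ."""
    pairs = zip(seq_a.upper(), seq_b.upper())
    return [i for i, (a, b) in enumerate(pairs, start=1) if a != b]


def span_length(first_nt, last_nt):
    """Number of nucleotides in an inclusive coordinate range."""
    return last_nt - first_nt + 1


creates_ecori_site = MUTATED_DONOR.upper() == ECORI_SITE
gt_dinucleotide_lost = "GT" in ORIGINAL_DONOR and "GT" not in MUTATED_DONOR.upper()
annotated = lowercase_positions(MUTATED_DONOR)
actual = differing_positions(ORIGINAL_DONOR, MUTATED_DONOR)
deleted_bp = span_length(7884, 7926)

print(creates_ecori_site, gt_dinucleotide_lost)
print(annotated, actual, deleted_bp)
```

The check confirms that the three annotated substitutions convert the donor sequence into an EcoRI site while removing its GT dinucleotide, and that the stated deletion coordinates correspond to 43 bp.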

1.3 Identification of negative elements
within gp120 mRNA

To distinguish between these possibilities, a
series of constructs were designed that allowed the
determination of the location of such INS elements.
First, a stop codon followed by the restriction sites for
NruI and MluI was introduced at the cleavage site between
the extracellular gp120 and the transmembrane protein gp41
at nt 7301 in plasmid NL15EDSS, resulting in p120DSS (Fig.
7). Immunoprecipitation of gp120 from the medium of cells
transfected with p120DSS confirmed the production of high
levels of gp120 only in the presence of Rev (Fig. 9B).
The release of gp120 is very efficient, since only barely
detectable amounts remain associated with the cells (data
not shown). This finding rules out the possibility that
the translation of the gp41 portion of the env cDNA is
responsible for the defect in env expression. Next, the
region 3' of the stop codon of gp120 (consisting of gp41,
including the RRE and 3' LTR) was replaced with the SV40
polyadenylation signal (Fig. 7). This
construct, p120pA, produced very low levels of gp120 in
the absence of Rev (Fig. 9B). Background levels of Env
were produced from p120DR (Fig. 7), which was generated
from pBS120DSS by removing the 5' portion of gp41
including the RRE (MluI to HpaI at nt 8200) (Fig. 9B).
These results demonstrate the presence of a major INS-like
sequence within the gp120 portion. To study the effect of
Rev on this mRNA, different RREs (RRE330, RRE270, and
RRED345 (Solomin et al., J. Virol. 64:6010-6017 (1990)))
were inserted into p120pA downstream of the gp120 stop
codon, resulting in p120R330, p120R270, and p120RD345,
respectively (Fig. 7). Immunoprecipitations demonstrated
that the presence of Rev in trans and the RRE in cis could
rescue the defect in the gp120 expression plasmid. High
levels of gp120 were produced from p120R330 (data not
shown), p120R270, and p120RD345 (Fig. 9B) in the presence
of Rev.

Northern blot analysis (Fig. 8A) confirmed the
protein data. The presence of Rev resulted in the
accumulation of high levels of mRNA produced by pBS120DSS,
p120R270, and p120RD345. Low but detectable levels of RNA
were produced from p120DpA and p120DR.

2. Identification of INS elements located
within the env mRNA regions using two
strategies

To identify elements that have a downregulatory
effect in vivo, fragments of env cDNA were inserted into
two different test expression vectors, p19 and p37M1-10D.
These vectors contain a strong promoter for rapid
detection of the gene product, such as the HIV-1 LTR in
the presence of Tat, and an indicator gene that is
expressed at high levels and can easily be assayed, such as
p19gag of RSV or the mutated p37gag gene of HIV-1
(p37M1-10D), neither of which contains any known INS-like
elements. Expression vector p19 contains the HIV-1 LTR
promoter, the RSV p19gag matrix gene, and HIV-1 sequences
starting at KpnI (nt 8561) including the complete 3' LTR
(Schwartz et al., J. Virol. 66:7176-7182 (1992)). Upon
transfection into HLtat cells, high levels of p19gag are
constitutively produced and are visualized on Western
blots. Expression vector p37M1-10D contains the HIV-1 LTR
promoter, the mutant p37gag (M1-10), and the 3' portion of
the virus starting at KpnI (nt 8561). Upon transfection
into HLtat cells, this plasmid constitutively produces
p37gag that can be quantitated by the HIV-1 p24gag antigen
capture assay.

2.1 Identification of INS elements using
the RSV gag expression vector

INS elements within the gp41 and gp120 portions
were identified. To this end, the vector p19 was used and
the following fragments (Fig. 10) were inserted: (A) nt
7684 to 7959; (B) nt 7684 to 7884 and nt 7927 to 7959;
this is similar to fragment A but has the region of the
splice acceptors 7A, 7B and 7 deleted; (C) nt 7595 to 7884
and nt 7927 to 7959, having the splice sites deleted as in
B; (D) nt 7939 to 8066; (E) nt 7939 to 8416; (F) nt 8200
to 8561 (HpaI-KpnI); (G) nt 7266 to 7595 containing the
intact RRE; (H) nt 5523 to 6190, having the splice donor
SD5 deleted.

Fragments A, B, and D did not affect Gag
expression, whereas fragment G (RRE) decreased Gag
expression approximately 5-fold. Fragments C, E, and H lowered
Gag expression by about 10- to 20-fold, indicating the presence
of INS elements.

Interestingly, it was observed that the
insertion of element F spanning 350 bp in plasmid p19
abolished production of Gag, indicating the presence of a
strong INS within this element. The presence of the RRE
in cis and Rev in trans resulted in production of high
levels of RSV p19gag. Fragment F also had a smaller
downregulatory effect on the expression of the
INS-corrected p17gag of HIV-1 (p17M1234). These
experiments revealed the presence of multiple elements
located within the env mRNA that cause inhibition of p19gag
expression.
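
The fragment coordinates and the effects reported above can be collected in a small lookup table. The following Python sketch is illustrative only: the class and function names are hypothetical, fragment lengths assume inclusive nucleotide coordinates, and the effect labels simply restate the observations in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvFragment:
    """One env cDNA fragment tested in vector p19 (hypothetical record type)."""
    label: str
    segments: tuple  # ((start_nt, end_nt), ...); inclusive coordinates are an assumption
    effect: str      # qualitative effect on Gag expression, as reported in the text

    def length(self) -> int:
        # Sum the lengths of all retained segments.
        return sum(end - start + 1 for start, end in self.segments)


FRAGMENTS = [
    EnvFragment("A", ((7684, 7959),), "none"),
    EnvFragment("B", ((7684, 7884), (7927, 7959)), "none"),
    EnvFragment("C", ((7595, 7884), (7927, 7959)), "10- to 20-fold decrease"),
    EnvFragment("D", ((7939, 8066),), "none"),
    EnvFragment("E", ((7939, 8416),), "10- to 20-fold decrease"),
    EnvFragment("F", ((8200, 8561),), "abolished"),
    EnvFragment("G", ((7266, 7595),), "about 5-fold decrease"),
    # The coordinates of the SD5 deletion in fragment H are not given in the text.
    EnvFragment("H", ((5523, 6190),), "10- to 20-fold decrease"),
]


def candidate_ins_fragments(fragments):
    """Return labels of fragments that lowered Gag expression."""
    return [f.label for f in fragments if f.effect != "none"]


if __name__ == "__main__":
    for frag in FRAGMENTS:
        print(frag.label, frag.length(), frag.effect)
    print("Candidates:", candidate_ins_fragments(FRAGMENTS))
```

Listing the fragments this way makes it easy to see that every fragment with an effect other than "none" overlaps at least one other tested region, which is consistent with the conclusion that env contains multiple inhibitory elements.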

2.2 Elimination of the INS within
fragment F

Six synthetic oligonucleotides (Table 3) were
generated that introduced 103 point mutations within this
region of 330 nt without affecting the amino acid
composition of Env. The mutated fragment F was tested in
p19 to verify that the INS elements are destroyed. The
introduction of the mutations within oligo #1 only
marginally affected the expression of p19gag, whereas the
presence of all oligos (#1 to #6) completely inactivated
the INS effect of fragment F. This is another example
showing that more than one region within an INS element needs to
be mutagenized to eliminate the INS effect.
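
The requirement that the point mutations leave the encoded amino acids unchanged can be checked computationally. The Python sketch below is a minimal illustration, not the procedure used in the experiments: it takes the M7gag wild-type and mutant sequences listed in Table 2, counts the substituted positions, and reports the reading frames in which both sequences translate identically. The reading frame is not stated in the table, so all three frames are tried. The function names are hypothetical and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: standard code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, frame: int = 0) -> str:
    """Translate complete codons of seq starting at the given frame offset."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))


def substitutions(wild_type: str, mutant: str) -> list:
    """Return 0-based positions at which the two sequences differ (case-insensitive)."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have equal length")
    pairs = zip(wild_type.upper(), mutant.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


def synonymous_frames(wild_type: str, mutant: str) -> list:
    """Return the frame offsets (0, 1, 2) in which both sequences encode the same peptide."""
    return [f for f in range(3) if translate(wild_type, f) == translate(mutant, f)]


# M7gag (nt 1105-1139), wild type and mutant as listed in Table 2.
M7GAG_WT = "CAGTAGGAGAAATTTATAAAAGATGGATAATCCTG"
M7GAG_MUT = "CAGTAGGAGAgATcTAcAAGAGgTGGATAATCCTG"

if __name__ == "__main__":
    print("substituted positions:", substitutions(M7GAG_WT, M7GAG_MUT))
    print("frames with identical translation:", synonymous_frames(M7GAG_WT, M7GAG_MUT))
```

A mutant oligonucleotide that returns no synonymous frame would change the protein in every frame and would therefore not satisfy the design constraint described above.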

It is noteworthy that this INS element is
present in all the multiply spliced Rev-independent mRNAs,
such as tat, rev and nef. Experiments were performed to
define the function of fragment F within the class of the
small mRNAs by removing this fragment from the tat cDNA.
In the context of this mRNA, this element confers only a
weak INS effect (3- to 5-fold inhibition), which suggests
that inhibition of expression in env mRNA may require the
presence of at least two distinct elements. These results
suggested that the INS effect within env is based on
multiple interacting components. Alternatively, the
relative location and interactions among multiple INS
components may be important for the magnitude of the INS
effect. Therefore, more than one type of analysis in
different vectors may be necessary for the identification
and elimination of INS.

2.3 Identification of INS elements using
the p37M1-10D expression vector

The env coding region was subdivided into
different consecutive fragments. These fragments and
combinations thereof were PCR-amplified using oligos as
indicated in Fig. 11 and inserted downstream of the
mutated p37gag gene in p37M1-10D. The plasmids were
transfected into HLtat cells that were harvested the next
day and analyzed for p24gag expression. Fig. 11 shows that
the presence of fragments 2, 3, 5 as well as the
combination 1+2+3 lowered gag expression substantially.
Different oligos (Table 4) were synthesized that change
the AT-rich domains including the three AATAAA elements
located within the env coding region by changing the
nucleotide but not the amino acid composition of Env. In
a first approach, these oligos 1-19 are being introduced
into plasmid p120R270 with the goal of producing gp120 in
a Rev-independent manner. Oligonucleotides such as oligos
20-26 will then be introduced into the gp41 portion, the
two env portions combined and the complete gp160 expressed
in a Rev-independent manner.
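
The two sequence features targeted here, AATAAA elements and AT-rich stretches, are easy to scan for. The following Python sketch is a simple illustration under stated assumptions: the function names are hypothetical, the motif search is an exact string match, and the input is the M20 wild-type and mutant pair listed in Table 4.

```python
def find_motif(seq: str, motif: str) -> list:
    """Return 0-based start positions of every (possibly overlapping) motif occurrence."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def at_fraction(seq: str) -> float:
    """Return the fraction of A and T residues in seq."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


# M20 (nt 7594-7633), wild type and mutant as listed in Table 4.
M20_WT = "GCCTTGGAATGCTAGTTGGAGTAATAAATCTCTGGAACAG"
M20_MUT = "GCCTTGGAAcGCcAGcTGGAGcAAcAAgTCcCTGGAACAG"

if __name__ == "__main__":
    for name, seq in (("wild type", M20_WT), ("mutant", M20_MUT)):
        print(name, "AATAAA at", find_motif(seq, "AATAAA"),
              "AT fraction", round(at_fraction(seq), 3))
```

For an RNA motif such as AUUUA, the same search can be run on the DNA sequence with the motif written as ATTTA.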



WO 93/20212 



PCT/US93/02908 



EXAMPLE 3
PROTO-ONCOGENE C-FOS
Fragments of the fos gene were inserted into the
vector p19 and the expression of the resulting plasmids
was analyzed by transfection into HLtat cells. It was
found that several fragments inhibited protein expression.
A strong INS was identified in the fragment containing
nucleotides 3328-3450 ("fragment [3328-3450]")
(nucleotides of the fos gene are numbered according to
GenBank sequence entry HUMCFOT, ACCESSION # V01512). In
addition, a weaker element was identified in the coding
region.

To eliminate these INS the following
oligonucleotides were synthesized and are used in
mutagenesis experiments as specified supra.

To eliminate the INS in the fos non-coding
region, the following oligonucleotides, which make changes
in the fos non-coding region between nucleotides [3328-
3450] (the letters in lower case indicate mutated
nucleotides), were synthesized and are used to mutagenize
fragment [3328-3450] in mutagenesis experiments as specified
supra:



#1: 3349-3391
TGAAAACGTTcgcaTGTGTcgcTAcgTTgcTTAcTAAGATGGA (SEQ ID NO: 16)

#2: 3392-3434
TTCTCAGATAccTAgcTTcaTATTgccTTaTTgTCTACCTTGA (SEQ ID NO: 17)



These oligonucleotides are used to mutagenize
fos fragment [3328-3450] inserted into vectors p19,
p17M1234 or p37M1234 and the expression of the resulting
plasmids is analyzed after transfection into HLtat cells.

The expression of fos is expected to be
increased by the elimination of this INS region.
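
Because lower-case letters mark the mutated nucleotides, the positions changed by each fos oligonucleotide can be read directly from its sequence. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that each oligonucleotide is written in the same orientation as the sequence numbering, with its first base at the stated start coordinate.

```python
def mutated_positions(oligo: str, start_nt: int) -> list:
    """Map lower-case (mutated) bases to sequence coordinates, assuming 1:1 alignment."""
    return [start_nt + i for i, base in enumerate(oligo) if base.islower()]


def spans_region(oligo: str, start_nt: int, end_nt: int) -> bool:
    """Check that the oligo length matches the stated inclusive coordinate range."""
    return len(oligo) == end_nt - start_nt + 1


# fos non-coding region oligonucleotides #1 and #2 (SEQ ID NOS: 16 and 17).
FOS_OLIGO_1 = ("TGAAAACGTTcgcaTGTGTcgcTAcgTTgcTTAcTAAGATGGA", 3349, 3391)
FOS_OLIGO_2 = ("TTCTCAGATAccTAgcTTcaTATTgccTTaTTgTCTACCTTGA", 3392, 3434)

if __name__ == "__main__":
    for seq, start, end in (FOS_OLIGO_1, FOS_OLIGO_2):
        positions = mutated_positions(seq, start)
        print(f"{start}-{end}: length ok = {spans_region(seq, start, end)}, "
              f"{len(positions)} mutated positions, first at nt {positions[0]}")
```

A length mismatch reported by `spans_region` would indicate a transcription error in either the sequence or its coordinates.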

To further define and eliminate the INS elements
in the coding region, additional longer fragments of fos
are introduced into vector p37M1234. The INS element in
the coding region is first mapped more precisely using
this expression vector and is then corrected using the
following oligonucleotides:

#1: 2721-2770
GCCCTGTGAGtaGGCActGAAGGacAGcCAtaCGtaACatACAAGTGCCA (SEQ ID NO: 18)

#2: 2670-2720
AGCAGCAGCAATGAaCCTagtagcGAtagcCTgAGtagcCCtACGCTGCTG (SEQ ID NO: 19)

#3: 2620-2669
ACCCCGAGGCaGAtagCTTtCCatCCTGcGCtGCcGCtCACCGCAAGGGC (SEQ ID NO: 20)

#4: 2502-2562
CTGCACAGTGGaagCCTcGGaATGGGcCCtATGGCtACcGAatTGGAaCCaCTGTGCACTC (SEQ ID NO: 21)

The expression of fos is expected to be
increased by the elimination of this INS region.

EXAMPLE 4
HIV-1 POL GENE
Vector p37M1234 was used to eliminate an
inhibitory/instability sequence from the pol gene of HIV-1
which had been characterized by A.W. Cochrane et al.,
"Identification and characterization of intragenic
sequences which repress human immunodeficiency virus
structural gene expression", J. Virol. 65:5305-5313
(1991). These investigators suggested that a region in
pol (HIV nucleotides 3792-4052), termed CRS, was important
for inhibition. A larger fragment spanning this region,
which contained nucleotides 3700-4194, was inserted into
the vector p37M1234 and its effect on the expression of
p37gag from the resulting plasmid (plasmid p37M1234RCRS)
(see Fig. 12) was analyzed after transfection into HLtat
cells.

Severe inhibition of gag expression (10-fold,
see Fig. 13) was observed.

In an effort to eliminate this INS, the
following oligonucleotides were synthesized (the letters
in lower case indicate mutated nucleotides) and used in
mutagenesis experiments.

First, it was observed that one AUUUA potential
instability element was within the INS region. This was
eliminated by mutagenesis using oligonucleotide M10pol and
resulted in plasmid p37M1234RCRSP10. The expression of
gag from this plasmid was not improved, demonstrating that
elimination of the AUUUA element alone did not eliminate
the INS. See Fig. 12. Therefore, additional mutagenesis
was performed and it was shown that a combination of
mutations introduced in plasmid p37M1234RCRS was necessary
and sufficient to produce high levels of gag proteins,
which were similar to those from the plasmid lacking CRS. The
mutations necessary for the elimination of the INS are
shown in Fig. 13.

The above results demonstrate that HIV-1 pol
contains INS elements that can be detected and eliminated
with the techniques described.

These results also suggest that regions outside
of the minimal inhibitory region in CRS as defined by A.W.
Cochrane et al., supra, influence the levels of
expression. These results suggest that the RNA structure
of the region is important for the inhibition of
expression.

Table 1

Correspondence between Sequence
Identification Numbers and Nucleotides in Figure 4

Sequence ID Nos.    Figure 4

SEQ ID NO: 1        nucleotides 336-731
SEQ ID NO: 2        nucleotides 402-452
SEQ ID NO: 3        nucleotides 536-583, above line
SEQ ID NO: 4        nucleotides 585-634, above line
SEQ ID NO: 5        nucleotides 654-703, above line
SEQ ID NO: 6        nucleotides 402-452, below line (M1)
SEQ ID NO: 7        nucleotides 536-583, below line (M2)
SEQ ID NO: 8        nucleotides 585-634, below line (M3)
SEQ ID NO: 9        nucleotides 654-703, below line (M4)

Table 2

Synthetic oligonucleotides used
in the mutagenesis of HIV-1 gag and pol regions

The upper sequence is the wild-type HIV-1 as
found in HIV HXB2, while the bottom is the mutant
oligonucleotide sequence. The location of the sequence is
indicated in parentheses.



M5gag (778-824)
CACCTAGAACTTTAAATGCATGGGTA (SEQ ID NO: 22)
XXXXX XXX
CACCTAGAACccTgAAcGCcTGGGTgAAgGTgGTAGAAGAGAAGGCT (SEQ ID NO: 23)

M6gag (871-915)
C^CCCCAC^GATITAAA^ (SEQ ID NO: 24)
X XX X XXX X
CCACCCCACAgGAccTgAACACgATGtTgAACACcGTGGGGGGAC (SEQ ID NO: 25)



M7gag (1105-1139)
CAGTAGGAGAAATTTATAAAAGATGGATAATCCTG (SEQ ID NO: 26)
X X X X X
CAGTAGGAGAgATcTAcAAGAGgTGGATAATCCTG (SEQ ID NO: 27)

M8gag (1140-1175)
GGATTAAATAAAATAGTAAGAATGTATAGCCCTACC (SEQ ID NO: 28)
X X X X X X
GGATTgAAcAAgATcGTgAGgATGTATAGCCCTACC (SEQ ID NO: 29)

M9gag (1228-1268)
ACCGGTTCTATAAAACTCTAAGAGCCGAGCAAGCTTCACAG (SEQ ID NO: 30)
XXX XX X X
ACCGGTTCTAcAAgACcCTgcGgGCtGAGCAAGCTTCACAG (SEQ ID NO: 31)

M10gag (1321-1364)
ATTGTAAGACTATTTTAAAAGCATTGGGACCAGCGGCTACACTA (SEQ ID NO: 32)
X XX X X XX X X
ATTGTAAGACcATcCTgAAgGCtcTcGGcCCAGCGGCTACACTA (SEQ ID NO: 33)



M11gag (1416-1466)
^t^^ CTC ^ G ^ TCAGCC ^ T ^ C ^ TOC AGCTACCATAATC (SEQ ID NO: 34)
XXX XXXXXX
[mutant sequence illegible in source] (SEQ ID NO: 35)

M12gag (1470-1520)
^^^^ ITOAGG ^ CC ^ G ^^ TOTO ^ GTC 'I^CAATTGT (SEQ ID NO: 36)
X XX XX X X X
^0^7^ CTOCC( ^ C ^ 9C ^^^ TCGTCAAGTCm ^ ,rTCT (SEQ ID NO: 37)

M13gag (1527-1574)
GAAGGGCACAC^GCCAGAAATTGCAGGG (SEQ ID NO: 38)
XXX XX X
GAAGGGCAC^CcGCCAGgAAcTGCcGGGCCCCccGGAAgAAGGGCTGT (SEQ ID NO: 39)

M14gag (1581-1631)
TGTGGAAAGGAAGGACACCAAATGAAAG^ (SEQ ID NO: 40)
XX X XXXX XX
^^•^^AgGGgCACCA.gATGAAgGAcTGcA.CgGAGcGgCAGGCTAAT (SEQ ID NO: 41)

M0pol (1823-1879) (K to R difference introduced)
CCCCTCGTCACAATAAAGATAGGGGGGCAACTAAAGGAAGCrCTATTAGATACAGGAG
(SEQ ID NO: 42)
X XX X X XX X
CCCCTCGTCACAgTAAgGATcGGGGGGCAACTcAAGGAAGCgCTgcTcGATACAGGAG
(SEQ ID NO: 43)

M1pol (1936-1987)
GATAGGGGGAATTGGAGGTTTTATCAAAGTAAGACAGTATGATCAGATACTC
(SEQ ID NO: 44)
XXXXX XXX XX
GATAGGGGGgATcGGgGGcTTcATCAAgGTgAGgCAGTAcGAcCAGATACTC
(SEQ ID NO: 45)

M2pol (2105-2152)
CCTATTGAGACTGTACCAGTAAAATTAAAGCCAGGAATGGATGGCCCA
(SEQ ID NO: 46)
XXXXXX XX
CCTATTGAGACgGTgCCcGTgAAgTTgAAGCCgGGgATGGATGGCCCA
(SEQ ID NO: 47)

M3.2pol (2162-2216)
CAATGGCCATTGACAGAAGAAAAAATAAAAGCATTAGTAGAAATTTGTACAGAGA
(SEQ ID NO: 48)
X XXXXX X X
CAATGGCCATTGACgGAAGAgAAgATcAAgGCcTTAGTcGAAATcTGTACAGAGA
(SEQ ID NO: 49)

M4pol (2465-2515)
TTCAGGAAGTATACTGCATTTACCATACCTAGTATAAACAATGAGACACCA
(SEQ ID NO: 50)
XXXX XXXX X
TTCAGGAAGTAcACgGCgTTcACCATcCCgAGcATcAACAAcGAGACACCA
(SEQ ID NO: 51)

M5pol (2873-2921)
TTAGTGGGGAAATTGAATTGGGCAAGTCAGATTTACCCAGGGATTAAAG
(SEQ ID NO: 52)
XX X XX X X
TTAGTGGGGAAggTGAAcTGGGCgAGcCAGATcTACCCgGGGATTAAAG
(SEQ ID NO: 53)

M6pol (3098-3150)
GGCCAATGGaCATATCAAATTTATCAAG^ (SEQ ID NO: 54)
XXXXXX XXXX
^ CgTAC ^ gATcTAC ^^ GCC 9 TOcAA 9 AA cCTGAAAACAGG (SEQ ID NO: 55)

M7pol (3242-3290)
TGGGGAAAGACTCCTAAATTTAAACTGCCCATACAAAAGGAAACATGGG
(SEQ ID NO: 56)
XXXXX XX X
TGGGGAAAGACgCCgAAgTTcAAgCTGCCCATcCAgAAGGAgACATGGG
(SEQ ID NO: 57)

M8pol (3520-3569)
GAAGA^AGTTACAAGC^TTTATCTAGCTTTGCAGGATTCGGGATTAG
(SEQ ID NO: 58)
XXXXXXXXX X
G^GACTCAGcTgCAgGCgATcTAcCTgGCgcTGCAGGAcTCGGGATTAG
(SEQ ID NO: 59)



M8.2pol (3643-3698)
GTTAGTCAATCAAATAATAGAGCAGTTAATAAAAAAGGAAAAGGTCTATCTGGCAT
(SEQ ID NO: 60)
X XX XXXX X X
^ AG ^= C ^T cATcG ^ CA GcTgATcAAgAAGGAgAAGGTgTATCTGGCAT
(SEQ ID NO: 61)

M9pol (3749-3800)
GTCAGTGCTGGAATCAGGAAAGTACTATTTTTAGATGGAATAGATAAGGCCC
(SEQ ID NO: 62)
XX XX XXXXXX
GTCAGTGCTGGgATCcGGAAgGTgCTATTccTgGAcGGgATcGATAAGGCCC
(SEQ ID NO: 63)

M9.2pol (3806-3863)
GAACATGAGAAATATCACAGTAATTGGAGAGC^
(SEQ ID NO: 64)
XXXXXXXXX XXXX
GAACATGAGAAgTAcCACtCCAAcTGGcGcGCtATGGCcAGcGAcTTcAACCTGCCAC
(SEQ ID NO: 65)

M10pol (3950-4001)
GGAATATGGCAACTMATTGTACACATTTAGAAGG
(SEQ ID NO: 66)
XXXXXXXXXXXX
GGAATATGGCAgCTgGAcTGcACgCAccTgGAgGGgAAgGTgATCCTGGTAG
(SEQ ID NO: 67)

M11pol (4031-4096)
GCAGAAGTTATTCCAGCAGAAACAGGGCAGGAAACAGCATAlTr^
-CAGGAAGA (SEQ ID NO: 68)
X X X X XXXXXXXXXX
GCAGAAGTTATcCCtGCtGAAACtGGGCAGGAgACcGCcTAcTTcCTgcTcAAAcTcG
-CAGGAAGA (SEQ ID NO: 69)

M12pol (4097-4151)
TGGCCAGTAAAAACAATACATACTGACAATGGCAGCAATOT
(SEQ ID NO: 70)
XXXXXX XX X X
TGGCCAGTgAAgACgATcCAcACgGACAAcGGaAGCAAcTTCACtGGTCCTACGG
(SEQ ID NO: 71)

M13pol (4220-4271)
GGAGTAGTAGAATCTATGAATAAAGAATTAAAGAAAATTATAGGACAGGTAA
(SEQ ID NO: 72)
X i XX XX XXX
GGAGTAGTAGAATCcATGAAcAAgGAAcTgAAGAAgATcATcGGACAGGTAA
(SEQ ID NO: 73)



M12pol-p (4097-4151) (indicates the sequence found in
p37M1234RCRSP10+P12p)
TGGCCAGTAAAAACAATACAcACgGACAAcGGaAGCAAcTTCACtGGTGCTACGG
(SEQ ID NO: 74)

Table 3

Sequences of mutant oligos designed
to eliminate the INS effect of fragment F

The six oligonucleotides used to eliminate the
INS effect of fragment F (oligos #1 to #6) are set forth
above in Example 2 (SEQ ID NOS: 10-15).

Table 4

Sequence of mutant oligos designed to
destroy INS elements within the env coding region

The wildtype (top) and the mutant oligo (below)
of 26 different regions are shown.
Mutant oligos for env of HIV-1.

M1 (5834-5878) 46-mer
CTTGGGATGTTGATGATCTGTAGTGCTACAGAAAAATTGTGGGTC (SEQ ID NO: 75)
X XXXXXXXX
CTTGGGATGcTGATGATcTGcAGcGCcACcGAgAAgcTGTGGGTC (SEQ ID NO: 76)

M2 (5886-5908) 24-mer
ATTATGGGGTACCTGTGTGGAAG (SEQ ID NO: 77)
XXX
ATTATGGcGTgCCcGTGTGGAAG (SEQ ID NO: 78)

M3 (5920-5956) 38-mer
CACTCTATTTTGTGCATCAGATGCTAAAGCATATGAT (SEQ ID NO: 79)
X X X X X X X
CACTCTATTcTGcGCcTCcGAcGCcAAgGCATATGAT (SEQ ID NO: 80)

M4 (5957-5982) 27-mer
ACAGAGGTACATAATGTTTGGGCCAC (SEQ ID NO: 81)
XXXX
ACAGAGGTgCAcAAcGTcTGGGCCAC (SEQ ID NO: 82)

M5 (6006-6057) 53-mer
[wild-type sequence illegible in source] (SEQ ID NO: 83)
XXXXXX XX XXXX
9GA9GT9GT9C ^ T9 ^ CGTCACC ^ 9 ^ CTO ^^ TCTC (SEQ ID NO: 84)

M6 (6135-6179) 46-mer
TAACCCCACTCTGTG^ (SEQ ID NO: 85)
X X X XX X X XX
TAACCCCcCTCTGcGTgAGccTgAAGTGCACcGAccTGAAGAATG (SEQ ID NO: 86)

M7 (6251-6280) 31-mer
ATCAGCACAAGCATAAGAGGTAAGGTGCAG (SEQ ID NO: 87)
X XX X X
ATCAGCACcAGCATccGcGGCAAGGTGCAG (SEQ ID NO: 88)

M8 (6284-6316) 34-mer
GAATATGCATTTTTTTATAAACTTGATATAATA (SEQ ID NO: 89)
X X X X X X
GAATATGCcTTcTTcTAcAAgCTgGATATAATA (SEQ ID NO: 90)

M9 (6317-6343) (28-mer)
CCAATAGATAATGATACTACCAGCTAT (SEQ ID NO: 91)
X XXX
CCAATAGcTAAgGAcACcACCAGCTAT (SEQ ID NO: 92)

M10 (6425-6469) (46-mer)
GCCCCGGCTGGTTTTGCGATT^ (SEQ ID NO: 93)
XXX XXXXXX
GCCCCGGCcGGcTTcGCGATcCTgAAgTGcAAcAAcAAGACGTTC (SEQ ID NO: 94)

M11 (6542-6583) (42-mer)
CAACTGCTGTTAAATGGCAGTCTAGCAGAAGAAGAGGTAGTA (SEQ ID NO: 95)
XXX XXXXX
CAACTGCTGcTgAAcGGCAGcCTgGCcGAgGAgGAGGTAGTA (SEQ ID NO: 96)

M12 (6590-6624) (35-mer)
TCTGTCAATTTCACGGACAATGCTAAAACCATAAT (SEQ ID NO: 97)
X X X X X
TCTGCCAAcTTCACcGACAAcGCcAAgACCATAAT (SEQ ID NO: 98)

M13 (6632-6663) (32-mer)
CTGAACACATCTGTAGAAATTAATTGTACAAG (SEQ ID NO: 99)
XXXXXX
CTGAACCAgTCcGTgGAgATcAAcTGTACAAG (SEQ ID NO: 100)

M14 (6667-6697) (31-mer)
CAACAACAATACAAGAAAAAGAATCCGTATC (SEQ ID NO: 101)
X X X XX X
CAACAACAAcACcGGcAAgcGcATCCGTATC (SEQ ID NO: 102)

M15 (6806-6852) (47-mer)
GCTAGCAAATTAAGAGAACA^^ (SEQ ID NO: 103)
XXXXXXXXXXXXX
^^^ScTgcGcGAgCAgTAcGGgAAcAAcAAgACcATAATCTT (SEQ ID NO: 104)

M16 (nt 6917-6961) (45-mer)
^CTACTGTAATTCAACAC^ACTGTTTAATAGTACTTGGTTTAAT (SEQ ID NO: 105)
XXXXX XXXX
TOCTACTGgAAcTCcACcCAgCTGTTcAAcAGcACcTGGTTTAAT (SEQ ID NO: 106)

M17 (nt 7006-7048) (43-mer)
^^ T ^ CCCTCC ^ TC ^ AA TAAAACAAATTATARACATO (SEQ ID NO: 107)
XXX XXXXXX
CACAATCACcCTgCCcTGCcGcATcAAgCAgATcATAAACATG (SEQ ID NO: 108)

M18 (nt 7084-7129) (46-mer)
^TCAGTGGACAAATTAGATGrrCATCAAATATTAO^CTGC^^ (SEQ ID NO: 109)
XXXXXXXXXXX
CATCAGCGGcCAgATccGcTGcTCcTCcAAcATcACcGGGCTGCTA (SEQ ID NO: 110)

M19 (nt 7195-7252) (58-mer)
[wild-type sequence illegible in source] (SEQ ID NO: 111)
X XXXXXXXXXXXXX
[mutant sequence missing in source] (SEQ ID NO: 112)

M20 (nt 7594-7633) (40-mer)
GCCTTGGAATGCTAGTTGGAGTAATAAATCTCTGGAACAG (SEQ ID NO: 113)
XXXXXXX
GCCTTGGAAcGCcAGcTGGAGcAAcAAgTCcCTGGAACAG (SEQ ID NO: 114)

M21 (nt 7658-7689) (32-mer)
GAGTGGGACAGAGAAATTAACAATTACACAAG (SEQ ID NO: 115)
GAGTGGGACcGcGAgATcAACAAcTACACAAG (SEQ ID NO: 116)

M22 (nt 7694-7741) (48-mer)
ATAC^CTCCTTAATTG&AGAATra (SEQ ID NO: 117)
XXXXXXX XXX
ATACACTCCcTgATcGAgGAgTCcCAgAACCAgCAgGAgAAGJ^TCWV (SEQ ID NO: 118)

M23 (nt 7954-7993) (40-mer)
CAGGCCCGAAGGAATAGAAGAAGAAGGTGGAGAGAGAGAC (SEQ ID NO: 119)
XXXXXXXX
CAGGCCCGAgGGcATcGAgGAgGAgGGcGGcGAGAGAGAC (SEQ ID NO: 120)

M24 (nt 8072-8121) (50-mer)
TACCACCGCTTGAGAGACITACTCITGATTC (SEQ ID NO: 121)
XXX XX X XX X
TACCacCGCcTGcGcG^ (SEQ ID NO: 122)

M25 (nt 8136-8179) (44-mer)
GGTGGGAAGCCCTCAAATATTGGTGGAATCTCCTACAGTATTGG (SEQ ID NO: 123)
X XX XX
GGTGGGAgGCCCTCZAAgTAcTGGTGGAAcCTCCTcCAGTATTGG (SEQ ID NO: 124)

M26 (nt 8180-8219) (40-mer)
AGTCAGGAACTAAAGAATAGTGCTGTTAGCTTGCTCAATG (SEQ ID NO: 125)
X X XXXXXX
AGTCAGGAgCTgAAGAAcAGcGCcGTgAaCcTGCTCAATG (SEQ ID NO: 126)



Comments:

Although the vast majority of oligonucleotides
follow the HXB2 sequence, some exceptions are noted:

In oligo M15, nt 6807 follows the pNL43
sequence. (Specifically, nt 6807 is C in NL43 but A in
HXB2.) Oligo M26 has the nucleotide sequence derived from
pNL43.

EXAMPLE 5

USE OF P37M1-10D OR P55M1-13P0 IN
IMMUNOPROPHYLAXIS OR IMMUNOTHERAPY

In postnatal gene therapy, new genetic
information has been introduced into tissues by indirect
means such as removing target cells from the body,
infecting them with viral vectors carrying the new genetic
information, and then reimplanting them into the body; or
by direct means such as encapsulating formulations of DNA
in liposomes; entrapping DNA in proteoliposomes containing
viral envelope receptor proteins; calcium phosphate
co-precipitating DNA; and coupling DNA to a polylysine-
glycoprotein carrier complex. In addition, in vivo
infectivity of cloned viral DNA sequences after direct
intrahepatic injection with or without formation of
calcium phosphate coprecipitates has also been described.
mRNA sequences containing elements that enhance stability
have also been shown to be efficiently translated in
Xenopus laevis embryos, with the use of cationic lipid
vesicles. See, e.g., J.A. Wolff et al., Science
247:1465-1468 (1990) and references cited therein.

Recently, it has also been shown that injection
of pure RNA or DNA directly into skeletal muscle results
in significant expression of genes within the muscle
cells. J.A. Wolff et al., Science 247:1465-1468 (1990).
Forcing RNA or DNA into muscle cells by other
means such as by particle acceleration (N.-S. Yang et
al., Proc. Natl. Acad. Sci. USA 87:9568-9572 (1990); S.R.
Williams et al., Proc. Natl. Acad. Sci. USA 88:2726-2730
(1991)) or by viral transduction should also allow the DNA
or RNA to be stably maintained and expressed. In the
experiments reported in Wolff et al., RNA or DNA vectors
were used to express reporter genes in mouse skeletal
muscle cells, specifically cells of the quadriceps
muscles. Protein expression was readily detected and no
special delivery system was required for these effects.
Polynucleotide expression was also obtained when the
composition and volume of the injection fluid and the
method of injection were modified from the described
protocol. For example, reporter enzyme activity was
reported to have been observed with 10 to 100 µl of
hypotonic, isotonic, and hypertonic sucrose solutions,
Opti-MEM, or sucrose solutions containing 2 mM CaCl2 and
also to have been observed when the 10- to 100-µl
injections were performed over 20 min. with a pump instead
of within 1 min.

Enzymatic activity from the protein encoded by
the reporter gene was also detected in abdominal muscle
injected with the RNA or DNA vectors, indicating that
other muscles can take up and express polynucleotides.
Low amounts of reporter enzyme were also detected in other
tissues (liver, spleen, skin, lung, brain, and blood)
injected with the RNA and DNA vectors. Intramuscularly
injected plasmid DNA has also been demonstrated to be
stably expressed in non-human primate muscle. S. Jiao et
al., Hum. Gene Therapy 3:21-33 (1992).

It has been proposed that the direct transfer of
genes into human muscle in situ may have several potential
clinical applications. Muscle is potentially a suitable
tissue for the heterologous expression of a transgene that
would modify disease states in which muscle is not
primarily involved, in addition to those in which it is.
For example, muscle tissue could be used for the
heterologous expression of proteins that can immunize, be
secreted in the blood, or clear a circulating toxic
metabolite. The use of RNA and a tissue that can be
repetitively accessed might be useful for a reversible
type of gene transfer, administered much like conventional
pharmaceutical treatments. See J.A. Wolff et al.,
Science 247:1465-1468 (1990) and S. Jiao et al., Hum. Gene
Therapy 3:21-33 (1992).

It had been proposed by J.A. Wolff et al.,
supra, that the intracellular expression of genes encoding
antigens might provide alternative approaches to vaccine
development. This hypothesis has been supported by a
recent report that plasmid DNA encoding influenza A
nucleoprotein injected into the quadriceps of BALB/c mice
resulted in the generation of influenza A nucleoprotein-
specific cytotoxic T lymphocytes (CTLs) and protection
from a subsequent challenge with a heterologous strain of
influenza A virus, as measured by decreased viral lung
titers, inhibition of mass loss, and increased survival.
J.B. Ulmer et al., Science 259:1745-1749 (1993).

Therefore, it appears that the direct injection
of RNA or DNA vectors encoding the viral antigen can be
used for endogenous expression of the antigen to generate
the viral antigen for presentation to the immune system
without the need for self-replicating agents or adjuvants,
resulting in the generation of antigen-specific CTLs and
protection from a subsequent challenge with a homologous
or heterologous strain of virus.

CTLs in both mice and humans are capable of
recognizing epitopes derived from conserved internal viral
proteins and are thought to be important in the immune
response against viruses. By recognition of epitopes from
conserved viral proteins, CTLs may provide cross-strain
protection. CTLs specific for conserved viral antigens
can respond to different strains of virus, in contrast to
antibodies, which are generally strain-specific.

Thus, direct injection of RNA or DNA encoding
the viral antigen has the advantage of being without some
of the limitations of direct peptide delivery or viral
vectors. (See J.B. Ulmer et al., supra, and the
discussions and references therein.) Furthermore, the
generation of high-titer antibodies to expressed proteins
after injection of DNA indicates that this may be a facile
and effective means of making antibody-based vaccines
targeted towards conserved or non-conserved antigens,
either separately or in combination with CTL vaccines
targeted towards conserved antigens. These may also be
used with traditional peptide vaccines, for the generation
of combination vaccines. Furthermore, because protein
expression is maintained after DNA injection, the
persistence of B and T cell memory may be enhanced,
thereby engendering long-lived humoral and cell-mediated
immunity.

1. Vectors for the immunoprophylaxis or
immunotherapy against HIV-1

The mutated gag genomic sequences in vectors
p37M1-10D or p55M1-13P0 (Fig. 6) will be inserted in
expression vectors using a strong constitutive promoter
such as CMV or RSV, or an inducible promoter such as
HIV-1.

The vector will be introduced into animals or
humans in a pharmaceutically acceptable carrier using one
of several techniques such as injection of DNA directly
into human tissues; electroporation or transfection of the
DNA into primary human cells in culture (ex vivo),
selection of cells for desired properties and
reintroduction of such cells into the body (said
selection can be for the successful homologous
recombination of the incoming DNA to an appropriate
preselected genomic region); generation of infectious
particles containing the gag gene, infection of cells ex
vivo and reintroduction of such cells into the body; or
direct infection by said particles in vivo.

Substantial levels of protein will be produced
leading to an efficient stimulation of the immune system.

In another embodiment of the invention, the
described constructs will be modified to express mutated
gag proteins that are unable to participate in virus
particle formation. It is expected that such gag proteins
will stimulate the immune system to the same extent as the
wild-type gag protein, but be unable to contribute to
increased HIV-1 production. This modification should
result in safer vectors for immunotherapy and
immunoprophylaxis.

EXAMPLE 6

INHIBITION OF HIV-1 EXPRESSION USING TRANSDOMINANT
(TD) TDGAG-TDREV OR TDGAG-PRO-TDREV GENES

Direct injection of DNA or use of vectors other
than retroviral vectors will allow the constitutive high-level
expression of trans-dominant gag (TDgag) in cells. In
addition, the approach taken by B.K. Felber et al.,
Science 239:184-187 (1988) will allow the generation of
retroviral vectors, e.g. mouse-derived retroviral vectors,
encoding HIV-1 TDgag, which will not interfere with the
infection of human cells by the retroviral vectors. In
the approach of Felber et al., supra, it was shown that
fragments of the HIV-1 LTR containing the promoter and
part of the polyA signal can be incorporated without
detrimental effects within mouse retroviral vectors and
remain transcriptionally silent. The presence of Tat
protein stimulated transcription from the HIV-1 LTR and
resulted in the high level expression of genes linked to
the HIV-1 LTR.

The generation of hybrid TDgag-TDRev or TDgag-
pro-TDRev genes and the introduction of expression vectors
in human cells will allow the efficient production of two
proteins that will inhibit HIV-1 expression. The
incorporation of two TD proteins in the same vector is
expected to amplify the effects of each one on viral
replication. The use of the HIV-1 promoter in a manner
similar to one described in B.K. Felber et al., supra,
will allow high level gag and rev expression in infected
cells. In the absence of infection, expression will be
substantially lower. Alternatively, the use of other
strong promoters will allow the constitutive expression of
such proteins. This approach could be highly beneficial,
because of the production of a highly immunogenic gag,
which is not able to participate in the production of
infectious virus, but which, in fact, antagonizes such
production. This can be used as an efficient
immunoprophylactic or immunotherapeutic approach against
AIDS.

Examples of trans-dominant mutants are described
in Trono et al., Cell 59:112-120 (1989).

1. Generation of constructs encoding
transdominant gag mutant proteins

Gag mutant proteins that can act as trans-
dominant mutants, as described, for example, in Trono et
al., supra, will be generated by modifying vector
p37M1-10D or p55M1-13P0 to produce transdominant gag
proteins at high constitutive levels.

The transdominant gag protein will stimulate the
immune system and will inhibit the production of
infectious virus, but will not contribute to the
production of infectious virus.

The added safety of this approach makes it more
acceptable for human application.

20 Those skilled in the art will recognize that any 

gene encoding a mRHA containing an inhibitory/instability 
sequence or sequences can be modified in accordance with 
the exemplified methods of this invention or their 
functional equivalents. 

25 Modifications of the above described modes for 

carrying out the invention that are obvious to those of 
skill in the fields of genetic engineering, protein 
chemistry, medicine, and related fields are intended to be 
within the scope of the following claims. 

30 Every reference cited hereinbefore is hereby 

incorporated by reference in its entirety. 
WHAT IS CLAIMED IS:

1. A method for reducing the effect of
inhibitory/instability sequences within the coding region
of a mRNA, said method comprising the steps of:

(a) providing a gene which encodes said
mRNA;

(b) identifying the inhibitory/instability
sequences within said gene which
encode said inhibitory/instability
sequences within the coding region of
said mRNA;

(c) mutating said inhibitory/instability
sequences within said gene by making
multiple point mutations;

(d) transfecting said mutated gene into a
cell;

(e) culturing said cell in a manner to
cause expression of said mutated gene;

(f) detecting the level of expression of
said gene to determine whether the
effect of said inhibitory/instability
sequences within the coding region of
the mRNA has been reduced.

2. The method of Claim 1 further comprising
the step of fusing said mutated gene to a reporter gene
prior to said transfecting step and said detecting step is
performed by detecting the level of expression of said
reporter gene.

3. The method of Claim 1 wherein step (b)
further comprises the steps of

(a) fusing said gene or fragments of said
gene to a reporter gene to create a
fused gene;

(b) transfecting said fused gene into a
cell;

(c) culturing said cell in a manner to
cause expression of said fused gene;

(d) detecting the level of expression of
said fused gene to determine whether
the expression of said fused gene is
reduced relative to the expression of
said reporter gene.
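The comparison in Claim 3 can be illustrated with a short sketch. It is not part of the claimed method: the function names, the example readings and the 5-fold cutoff are all hypothetical, and the sketch assumes only that expression of the reporter alone and of the fused gene are measured in the same units.

```python
def fold_inhibition(reporter_alone: float, fused: float) -> float:
    """Return how many times lower the fused-gene expression is
    than the expression of the reporter gene alone."""
    if reporter_alone <= 0 or fused <= 0:
        raise ValueError("expression levels must be positive")
    return reporter_alone / fused


def suggests_ins(reporter_alone: float, fused: float, threshold: float = 5.0) -> bool:
    """Flag a fragment as a candidate INS when fusing it to the reporter
    lowers expression by at least `threshold`-fold.
    The default threshold is an arbitrary illustrative value."""
    return fold_inhibition(reporter_alone, fused) >= threshold


# Hypothetical readings in arbitrary units, not data from this document.
print(fold_inhibition(200.0, 20.0))  # 10.0
print(suggests_ins(200.0, 20.0))     # True
print(suggests_ins(100.0, 90.0))     # False
```

A fragment that leaves reporter expression essentially unchanged gives a ratio near 1 and is not flagged.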



4. The method of Claim 3 wherein step (a)
comprises fusing said gene or fragments of said gene 3' to
the stop codon of said reporter gene.

5. The method of Claim 3 wherein step (a)
comprises fusing said gene or fragments of said gene in
frame with the 3' end of the coding region of said
reporter gene.

6. The method of Claim 1 or 2 wherein said
mutating step changes the codons such that the amino acid
sequence encoded by the mRNA is unchanged.

7. The method of Claim 6 wherein said
inhibitory/instability sequences are AT-rich and wherein
said mutating step comprises substituting either G or C
for either A or T and wherein the final nucleotide
composition of said mutated inhibitory sequence is about
50% A and T and about 50% G and C.

8. The method of Claim 6 wherein at least 75%
of the point mutations replace conserved nucleotides with
non-conserved nucleotides.

9. The method of Claim 6 wherein said mutating
step comprises substituting less preferred codons with
more preferred codons.
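The idea behind Claims 6 and 7, lowering the A and T content of a coding sequence while leaving the encoded amino acids unchanged, can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the exemplified constructs. It uses the standard genetic code, replaces codons from left to right with the most G/C-rich synonym, and stops once the A+T fraction reaches a target. It ignores codon preference (Claim 9) and sequence conservation (Claim 8). The input sequence and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def at_fraction(seq: str) -> float:
    """Fraction of bases that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


def gc_count(codon: str) -> int:
    return sum(base in "GC" for base in codon)


def recode(seq: str, target_at: float = 0.5) -> str:
    """Swap codons for G/C-richer synonyms until A+T <= target_at."""
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    for i, codon in enumerate(codons):
        if at_fraction("".join(codons)) <= target_at:
            break
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        best = max(synonyms, key=gc_count)
        if gc_count(best) > gc_count(codon):
            codons[i] = best
    return "".join(codons)


original = "AAATTAAAACATATAGTATGGGCA"  # illustrative AT-rich input
mutated = recode(original)
print(at_fraction(original), at_fraction(mutated))
print(translate(original) == translate(mutated))  # True: coding capacity unchanged
```

Checking the translation before and after recoding is the point of the sketch: every substitution is silent, so only the nucleotide composition changes.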
10. The method of Claim 1 or 2 wherein said
mRNA encodes the GAG protein of a Rev-dependent complex
retrovirus.

11. The method of Claim 10 wherein the Rev-dependent
complex retrovirus is human immunodeficiency
virus-1.



12. A method of increasing the production of a 
polypeptide, wherein said polypeptide is encoded by a mRNA 
that contains one or more inhibitory/instability 
sequences, said method comprising the steps of: 

(a) providing a gene which encodes said 
mRNA; 

(b) identifying the inhibitory/instability
sequences within said gene which
encode said inhibitory/instability
sequences within the coding region of
said mRNA;

(c) mutating said inhibitory/instability
sequences within said gene by making
multiple point mutations;

(d) transfecting said mutated gene into a
cell;

(e) culturing said cell in a manner to
cause expression of said mutated gene;

(f) detecting the level of expression of
said gene to determine that the effect
of said inhibitory/instability
sequences within the coding region of
the mRNA has been reduced;

(g) providing a host cell transfected with
an expression vector containing said
mutated gene;

(h) culturing said host cell to cause
expression of said polypeptide; and

(i) recovering said polypeptide.

13. A method of producing polypeptides, whose
native production is impeded by the presence of an
inhibitory/instability sequence, comprising the steps of:

(a) providing a host cell transfected with 
an expression vector containing a gene 
encoding said polypeptide, said gene 
having been mutated to decrease the 
effect of the inhibitory/instability 
sequence; 

(b) culturing said host cell to cause 
expression of said polypeptide; and 

(c) recovering said polypeptide. 

14. The method of Claim 13 wherein said host
cell is prokaryotic.

15. The method of Claim 13 wherein said host
cell is eukaryotic.

16. The method of Claims 13, 14 or 15 wherein
said gene is a cDNA.

17. The method of Claims 13, 14 or 15 wherein
said gene is genomic.

18. An artificial nucleic acid construct
comprising a gene wherein the expression of the native
gene is impeded by the presence of inhibitory/instability
sequences in the mRNA encoded by said native gene, said
gene having been mutated to decrease the effect of the
inhibitory/instability sequence.



19. The construct of Claim 18 wherein the amino
acid sequence encoded by said mutated gene is the same as
the amino acid sequence encoded by the native gene.

20. The construct of Claim 19 wherein said
native gene is HIV-1 gag.

21. The construct of Claim 20 wherein said HIV-1
gag gene has been mutated by the introduction of
multiple point mutations between nucleotides 402 and 452,
536 and 583, 585 and 634, and 654 and 703.

22. The construct of Claim 19 wherein said
native gene is HIV-1 env.



23. An assay kit for identifying
inhibitory/instability sequences in a mRNA, comprising:

(a) the nucleic acid construct of Claim 20
or 21; and

(b) a detection system for detecting the
level of expression of said gene in
said nucleic acid construct.

24. The kit of Claim 23 wherein said detection
system is an ELISA.



25. An artificial nucleic acid construct
comprising a gene mutated by the method of Claim 1 or 2.

26. A vector comprising the nucleic acid
construct of Claim 25.

27. A transformed host cell comprising the
artificial nucleic acid construct of Claim 25.

28. A vector comprising the nucleic acid
construct of Claim 18 or 19.

29. A transformed host cell comprising the
artificial nucleic acid construct of Claim 18 or 19.

30. A transformed host cell of Claim 29 wherein
said cell is selected from the group consisting of
eukaryotes and prokaryotes.

31. The host cell of Claim 30 wherein said cell
is a human cell.

32. The host cell of Claim 30 wherein said cell
is a Chinese Hamster Ovary cell.

33. The host cell of Claim 30 wherein said cell
is E. coli.

34. The construct of Claim 20 wherein said HIV-1
gag gene has been mutated by the introduction of
multiple point mutations between nucleotides 402 and 452,
536 and 583, 585 and 634, 654 and 703, 871 and 915, 1105
and 1139, 1140 and 1175 and 1321 and 1364.

35. The construct of Claim 34 wherein said HIV-1
gag gene is p37M1-10D.

36. The construct of Claim 20 wherein said HIV-1
gag gene has been mutated by the introduction of
multiple point mutations between nucleotides 402 and 452,
536 and 583, 585 and 634, 654 and 703, 871 and 915, 1105
and 1139, 1140 and 1175, 1321 and 1364, 1416 and 1466,
1470 and 1520, 1527 and 1574, and 1823 and 1879.

37. The construct of Claim 36 wherein said HIV-1
gag gene is p55M1-13P0.
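The nucleotide windows recited in Claims 34 and 36 can be written down as data, which makes it easy to check whether a given point mutation falls inside one of them. The sketch below is only a bookkeeping aid. It assumes the recited end points are inclusive, which the claims do not state, and the function names and example positions are hypothetical.

```python
# Windows recited in Claim 34 (start, end), assumed inclusive.
CLAIM_34_WINDOWS = [
    (402, 452), (536, 583), (585, 634), (654, 703),
    (871, 915), (1105, 1139), (1140, 1175), (1321, 1364),
]

# Claim 36 recites the same windows plus four more.
CLAIM_36_WINDOWS = CLAIM_34_WINDOWS + [
    (1416, 1466), (1470, 1520), (1527, 1574), (1823, 1879),
]


def window_for(position, windows):
    """Return the (start, end) window containing position, or None."""
    for start, end in windows:
        if start <= position <= end:
            return (start, end)
    return None


def outside_windows(positions, windows):
    """List the positions that fall in none of the windows."""
    return [p for p in positions if window_for(p, windows) is None]


# Hypothetical mutation positions, chosen only to exercise the functions.
print(window_for(420, CLAIM_36_WINDOWS))                    # (402, 452)
print(outside_windows([420, 584, 1500], CLAIM_34_WINDOWS))  # [584, 1500]
print(outside_windows([420, 584, 1500], CLAIM_36_WINDOWS))  # [584]
```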



38. A vaccine composition for inducing immunity
in a mammal against HIV infection comprising a
pharmaceutically acceptable medium and further comprising
a therapeutically effective amount of a nucleic acid
construct capable of producing HIV gag protein in the
absence of any HIV regulatory protein in a cell in vivo.

39. A vaccine composition according to claim 38 
wherein said mammal is a human. 

40. A vaccine composition according to claim 38 
wherein said regulatory protein is HIV-l Rev. 



41. A vaccine composition according to claim 38
wherein said construct is selected from the group
consisting of the construct of claim 20, 21, 34, 35, 36,
and 37.

42. A method for inducing immunity against HIV
infection in a mammal which comprises administering to a
mammal a therapeutically effective amount of a vaccine
composition comprising a nucleic acid construct capable of
producing HIV gag protein in the absence of any HIV
regulatory protein in a cell in vivo.

43. A method according to claim 42 wherein said
mammal is a human.

44. A method according to claim 42 wherein said
regulatory protein is HIV-1 Rev.

45. A method according to claim 42 wherein said
construct is selected from the group consisting of the
construct of claim 20, 21, 34, 35, 36, and 37.



[Drawings, WO 93/20212 (PCT/US93/02908), sheets 1/18 to 10/18. The
scanned figures did not survive text extraction. Legible labels
include: gag region labels p17, p24 and p7, and Pol; vpr and vpu;
RRE; blot lane labels Rev, 28S and 18S; env labels gp120 and gp41;
and construct labels p120pA, P120R330, P120DR and P120R270.]

[Sheet 11/18]

Identification of INS regions within the
env mRNA using the p19 vector.

FRAGMENT      SIZE   INS EFFECT
A             276    none
B             234    none
C             323    10X
D             128    none
[illegible]   478    [illegible]
F             362    >100X
G             330    [illegible]
[illegible]   668    [illegible]

[The nucleotide coordinates of the fragments and one trailing "10X" entry are not reliably legible in the scanned source.]



[Drawings, sheets 12/18 to 15/18: no legible content survived text extraction.]

[Sheet 16/18: nucleotide sequence with amino acid translation]



1 7 < X > ACaXr WTTGGTCC CAAAAAAGAC AAGAGATCCT TGATCTGTGG AICTACCACA C ACAAGGC7A 
71 OTCCCTGAT ACACACCAGG GCCACCGATC AGATATCCAC TGACCTTTGC ATGGTGCTTC 

141 MSmSaC CAGT TGAACC AGAGCAAGTA GAAGAGGCCA AATAAGGAGA CAASAACAGC TTGTTACACC 
2U CaXSMCCA «aTG «aiG SACGACCCGG AGGGAGAAGT ATTAGTGTGG AAGITTGACA GCCICCTAGC 
281 "y^CGTnC ATCGCCCGAG MCWCMCC GGASTACTVC AAAGACTGCT GACATCGAGC TTTCIMAAG 
351 GSaCrrrCC6 CTO GGGaOT TCCA6GGAGG TGTGGCCTGG GCGGGACTCG GGAGTGGCGA GCCCIOGAT 
421 eCT » CMM * *«»6 CTGCT rTTTGCCTCT ACTGSGTCTC TCTCGTOa CCASATCTGA GCCT6GGAGC 
491 mCKGC ^ ^CTAG GGAAC CCACTGCm AGCCTCAAIA MGCITGCCT TGAGTGCTCA AAGTAGTCTG 
361 "CCCeTCTG TOT CISACT CTGGIAACTA GASAICCCTC ACACCCTTTT AGTCAGTGTG GAAAASCICT 
631 H**?™** CCCGAACAGG^acrKaAAG CGAAAGTAAA GCCAGAGGAS ATCTGTCCtf: 

BssHU (711) 
701 GCTTGCTGAAGCaGCGTCaCAGAGAt^ 

l»MstGI yAl aArgAI aSer Val LeuSer Q yGI yQI uLauAspArgTrp 

" 55ggEaa55555BS55ga^ 

wo wawr m mat yAl *Thf ProG» nA»pL«uA»nThrMit L« uAsnThr Val Q yQI yH 
\ 6 * 3 " A ^*^ CGG1TOATA ^ 
^63> pTyrVal AspArgPhaTy fLysThr LeuArgAI aG uGl nAI aSar Gl nGI uVal LysAsnT rpVbi Thr Q u Thr 

1B9> LeuLeuVal (3 nAsnAI aAsnProAtpCysLysThr 1 1 aLeuLysAI aLeuGI yP roAl aAl aThr LeuQ uQ uM 

214» etMMThr Al aCysGl nQ yVal Gl yGI yProG ) yHi gLysAI aArgVal Leu 
Apal (1848) 

1841 CGAGGGGGGG CCCGGIACCT TTAAGACCAA TGACTIACAA GGCAGCTGIA GATCTtAGCC ACTTTTTAAA 

1911 AGAAAAGGGG GGACTGGAAG GGCIAAITCA CTCCCAAAGA AGACAAGAIA TCCTTGAICT GT GGAICiaC 

1981" CACACACAAG GCTACTTCCC TOATT66CA6 AACTTtCACAC CA6GGCCAG6 G6TCA6AIAT CCACTGA CCT 

2051 "G6MGST6 CIACAAGCIA GTACCA6TT6 AGCCAGA IAA GGIAGAAGAG GCCAAXAAAG GAGAGAACAC 

2121 °^"»"» CACCCTGIGA GCCTGCATGG AATG6AT6AC CCTGAGAGAG AAGTOTTAGA GTGGAGGHT 

2191 <aCASCCSCC TAGCATTTCA TCACGTGGCC CGAGAGCTGC ATCCGGAGTA CTTCAA6AAC TGCTGACATC 

2261 SAGwW^ CAAGGGACIT TCCGCTGGGG ACTTTCCAGG GAGGCGIGGC CTGGGCGGGA CTG6GGA6TG 

2331 GCtaSCCCTC AGATGCTGCA TATAAGCAGC TGCTTTTTGC CTCTACTGGC TCTCTCTGCT tAGACCAGAT 

2401 CtGAGCCTGG GAGCTCTCTG GCTAACIA66 GAACCCACIC CTTAAGCCTC AAZAAASCTT GCCTTGASTG 

2471 "TCAAGIAG TCTSTGCCCG ICTGITGTGT GACTCTGGTA ACTAGAGATC CCTCA6ACCC UTTAGTCAG 

2S41 TGTGGAAAAT CICIAGCAC C CCCCAGGAGG TAGAGGTTGC AGTGAGCCAA GATCGCGCCA CTGCATTCCA 

l££ CTGTCBVAAA TAAIAATAAT AAGTTAAGGG MTAAATAT ATTIAIACAT 

2?5 JESSES ?ffTT° CSCT 6SGMCACTS OCTCACACCI GCGCCCGGCC CTTTGGGAGG 

mm S^SSSS JS^SS GAGTTOGGA GTTCCAGACC AGCCTGACCA ACATGGAGAA ACCCCTTCTC 
nt/^T!" aTTTBlTGT GTAITTtATT CACAGGIAX? TCTGGAAAAC TGAAACTGTT 

ISS CAAGAATCAX CAGCACAGAG GAAGACTTCT GXGATCAAAT GTGGTGSGAG 

3o£ S^S^ ^f^^ *SITCTGCCG CAGACTCGGC GGGTGTCC1T CGGTTCAGT7 

£5 fS^Sr 2255£ £? GGTC * ( » CCACAGGGTG AGGGCTCAGT CCCCAAGACA TMACACCCA 

SS ^ aCGC CT5CTG <*« 6GCAGAGCCG AZTCACCAAG ACGGGAAXIA 

3zll J 0 " 3 " 3 "* 0 GSCTGTCCGG GAGAACG6AG TTCZAXTATG ACTCAAATCA 

I m ATCAGAGTII RAAGGAXAA CITAGTGTGT AGGGGGCCAG TGAGTTGGAG 

33M AWT6KCTT TTGCGCCGAG TCAGTTCCTG GGTGGGGGCC ACAAGAXCGG 

a<n £5^2^ r ° aCaATCC OGSGGTGCOk GCTGATCCAT GGAGTGCAGG GTCTGCAAAA TAKTCAAGC 

All «TIACCCCA ggaacaattt ggggaaggic AGAATCTTGT 

3321 AGCCTGTAGC IGCATGACTC CtAAACCAIA ATTTCITTTT lUlllTm UHiAlUU 1 IGAGACAGGG 

llll rOCC^GGC TGGAGTGCAG TGGTGCAATC ACAGCTOVCT 'gScCTA GAGCGGCCGC 

«S ^SSS SESSSS TCK:CCIAI * «5rGA51CGIA ITACAATTCA CTGGCCGTCG TTTTACAACG 
MM SSSSS GCGTTACCCA ACTTAATCGC CTTGCAGCAC ATCCCCCITT CGCCAGCTGG 

lm SSSSSf AA ? hGGC ^ CG «=CGA««: CCTTCCCAAC AGTTGCGCAG CCTGAATGGC GAATGGCGCG 
£££2^ gffff?" "G* 13 **" «5CGIXAAA TTCTTGTTAA ATCAGCTCAT TTTTTAACCA 
?S m ^S^** ^t:™ 3 * ATCAAMGAA TAGACCGAGA TAGGGTTGAG TGTTGTXCCA 

ISfii A"***"* 1 GTGGACTCCA ACGTCAAAGG GCGAAAAACC GTCXA1CAGG 

1?S S^ISSS ^E?^ CCATCACCCT AATCAAGTTT TTtGGGGICG AGGTGCCGZA AAGCACZAAA 
^ "* GGGAGCC CCCGWTTAG AGCTTGACGG GGAAAGCCGG CGAACGTGGC GAGAAAGGAA 

5^ G 5* GC GGGCGCIAGG GCGCTGGCAA GIGXAGCGGT CACGCTGCGC GIAACCACCA 
GCITAATGCG CCGCTACAGG GCGCGTCCCA GGTGGCACTT TTCGGGGAAA TGTGCGCGGA 
43G1 ACCCCDOTT GTTTAITTTT CTAAAXACAT TCAAAZAXGT ATCCGCTCAT GAGACAAIAA CCCT6AXAAA 

P 3 . W t 
4S01 ££££££ AG6W « 3 "^ TGAGTATTCA ACATTTCCGT GTCGCCCTTA TTCCCTnTT 

" 01 ~iS3 TGCCITCCTG TTrmCTCA CCCAGAAACG CTGGTGAAAG TAAAAGATGC TGAAGATCAG 

HrS^-SS^S 11 * CAT aaACT6 GATCTCAACA GCGGZAAGAT CCTTGAGAGT HTCGCCCCG 

AAGAAC£31TT TCOLATRaTfi i » **, ■ . 



4571 
4641 
4711 
4781 
4851 



w — «r,*w*v*»nw» tMMiiW^ UUi.rtafttaA(jT TTTCGCCCCG 

-fiJSS" 6 MaCITm «GTTCIGCT ATGTCGCGCG GXATXATCCC GIATTGACGC 
££££££ fX"**"* CIA1TCICAG AATGACXTGG ITGAGIACTC ACCACTCACA 

^ G ^ TGG CA«aa«A AGAGAATnT GCAGTGCTGC CAIAACCATG AGTGABIACA 
4«i CTTAC1TCTG AGAAC6ATCG GAGGACCGAA GGAGCIAACC GCTTTTTTGC ACAACATGGG 

J£ A^S SSIS jS?^ ^^GOfiCIG AAXGAAGCCA TACCAAACGA e££S 
4991 ACCAC6ATGC CTGTAGCAAI GGCAACAACG 1TGCGCAAAC TATtAACIGG CGAACXACTT ACTCTAfiCTT 
I m CGGATAAAGT j^SScS! ^SgS 

f 2W gIScS ^fPr! 25=?»~ «WGGICTC GCGGIAXCAT 

moi GSSCCAGATG GTAAGCCCTC CCGXATCGTA GmiCIACA CGACGGGCAG ICAGGOttCT ATGGMfiAae 

f Li SZSSS ££22°* MMGTCCCT OCXGAmA GCa££S£ SSaS 
££ S^SJ ttWACIra g^TTT AAAAGGAICI AGGTGAAGAT CCTOITGAT 

<J« t^TSSS TOACGTGA5 U 11U.1M ACTSASCSTC AGACCCCGIA GAAttGATCA 

t«l f 00 " 6 " TITmCT5C GCGIAATCTG CTCCXTGCAA ACWuSc OCCgS 

SS51 AGCGGIGGTT TGTTTGCCGG AICAAGAGCT ACCAACTCTT TTTCCGAAGG BACTGGCTT CMEAGAGeG 

l«i ^f 6 " 5 22£5? 6 ««ag«ag SSSaS gS££ 

5691 CEACAIACCT CGCTCTCCIA ATCCTGRAC CAGTGGCTGC TGCCAGTGGC GATAAGTCGT GTCTOC^ 

SS SESsI 22£S?* =«*=««C TACAGCGTGA GCXATGAGAA AgSSgC 

5971 G^SgI g^"*" CAGGGWGGA ACAGGAGAGC GCACGAGGGA 

S£ ^™ C ^ 5 SS ^SiSSE?" GGTATCTTTA XAGTCCTGTC GGGTTTCGCC ACCTCIGACT TGAGCGTCGA 
Sf i GCTCGTCAGG GGGGCGGAGC CtATGGAAAA ACGCCAGCAA CGCGGCCITT TTMGGHSC 

T^rrnv^m Sf?.?^^^ GCTCACATCT TCTTTCCTGC GTXATCCCCT GARCTGTGG AIAACCgSt 
®^ T S G f TG "*CCGCTCG CCGCAGCCGA ACGACCGAGC KMCGAGTC ACTGMCGAG 
ACGAC&GGTT TereRarrltr *®"^AAACCG CCTCTCCCCG CGCGZTSGCC GATTCATTAA TGCAGCXGGC 
GGCACCCCAG GCTtoStt f^nw^^^ GIGAGOGCAA CGCAATBUff GTGASXXAGC TCACTCATTA 
GGCACCCOG GCTTXACACT TTATGCITCC GGCTCG1MG TTCTGTGGAA TTGTGAGCGG AXAACAART 
CAOCA^ACAGCnTGA CCATGAITAC GCCAAGCTCG GAATTAACCC G^S 



6111 
6181 
6251 
6321 
6391 
6461 



6531 
6601 



SSSSS? S^ffE 01 «GiGnwcc iGAcasro ccttccito 

fiffn SSf^ ^■^■■ 1U; C 1 * 6 ""* 6 TTACTTCCGT TCAGCCAAGG TCIGAAACXA GGTGCGCACA 

S2 ™« SSS GcmRCAGG GGGrraacA cagtgcaccc tgacSct 

r ?"- 1 " 1 ^ GGGGGITTAT CACATTGCAC CCTGACAGTC GTCAGCCICA CAGGGGGTTT ATCACAGTGC 
TCMTCCATT TGAITOCAA TTTTTTIAGT CTC^OGTO SSaCTTGT S 

aSSS SSfSS GGTGMGACT ACCTCAGTTG GAXCTCCACA GGTCACAGTG 

7091 ^ACTG^ JSS SSS^SE aCCftCRM « GSCCGCCCTC CACGTSCACA TGGCCGGAGG 
■m« ffS^rl""* TCGSft g 6 ? CC AAGCACACCT GCGCATCAGA GTCCTTGGTG TOGAGGGAGG 6ACCAGCGCA 
7161 GCTTCCAGCC AICCACCTGA IGAACAGAAC CXAG6GAAAG ^GTTCT Jc^S SSa^ 